Abstract 

We provide a framework for reasoning about information-hiding require- 
ments in multiagent systems and for reasoning about anonymity in particular. 
Our framework employs the modal logic of knowledge within the context of 
the runs and systems framework, much in the spirit of our earlier work on secrecy
[Halpern and O'Neill 2002]. We give several definitions of anonymity
with respect to agents, actions, and observers in multiagent systems, and we 
relate our definitions of anonymity to other definitions of information hid- 
ing, such as secrecy. We also give probabilistic definitions of anonymity 
that are able to quantify an observer's uncertainty about the state of the sys- 
tem. Finally, we relate our definitions of anonymity to other formalizations 
of anonymity and information hiding, including definitions of anonymity in 
the process algebra CSP and definitions of information hiding using function 
views. 



1 Introduction 



The primary goal of this paper is to provide a formal framework for reasoning about 
anonymity in multiagent systems. The importance of anonymity has increased over 
the past few years as more communication passes over the Internet. Web-browsing, 
message-sending, and file-sharing are all important examples of activities that com- 
puter users would like to engage in, but may be reluctant to do unless they can re- 
ceive guarantees that their anonymity will be protected to some reasonable degree. 
Systems are being built that attempt to implement anonymity for various kinds of 
network communication (see, for example, [Goel, Robson, Polte, and Sirer 2002;
von Ahn, Bortz, and Hopper 2003; Levine and Shields 2002; Reiter and Rubin 1998;
Sherwood, Bhattacharjee, and Srinivasan 2002; Syverson, Goldschlag, and Reed 1997]).
It would be helpful to have a formal framework in which to reason about the level 
of anonymity that such systems provide. 

We view anonymity as an instance of a more general problem: information 
hiding. In the theory of computer security, many of the fundamental problems and 
much of the research has been concerned with the hiding of information. Cryp- 
tography, for instance, is used to hide the contents of a message from untrusted 
observers as it passes from one party to another. Anonymity requirements are in- 
tended to ensure that the identity of the agent who performs some action remains 
hidden from other observers. Noninterference requirements essentially say that 
everything about classified or high-level users of a system should be hidden from 
low-level users. Privacy is a catch-all term that means different things to different 
people, but it typically involves hiding personal or private information from others. 

Information-hiding properties such as these can be thought of as providing an- 
swers to the following set of questions: 

• What information needs to be hidden? 

• Who does it need to be hidden from? 

• How well does it need to be hidden? 

By analyzing security properties with these questions in mind, it often becomes 
clear how different properties relate to each other. These questions can also serve 
as a test of a definition's usefulness: an information-hiding property should be able 
to provide clear answers to these three questions. 

In an earlier paper [Halpern and O'Neill 2002], we formalized secrecy in terms
of knowledge. Our focus was on capturing what it means for one agent to have to- 
tal secrecy with respect to another, in the sense that no information flows from the 
first agent to the second. Roughly speaking, a high-level user has total secrecy if 
the low-level user never knows anything about the high-level user that he didn't 
initially know. Knowledge provides a natural way to express information-hiding 
properties: information is hidden from a if a does not know about it. Not
surprisingly, our formalization of anonymity is similar in spirit to our formalization
of secrecy. Our definition of secrecy says that a classified agent maintains secrecy
with respect to an unclassified agent if the unclassified agent doesn't learn any new
fact that depends only on the state of the classified agent. That is, if the agent
didn't know a classified fact $\varphi$ to start with, then the agent doesn't know it at any
point in the system. Our definitions of anonymity say that an agent performing an 
action maintains anonymity with respect to an observer if the observer never learns 
certain facts having to do with whether or not the agent performed the action. 

Obviously, total secrecy and anonymity are different. It is possible for i to have 
complete secrecy while still not having very strong guarantees of anonymity, for 
example, and it is possible to have anonymity without preserving secrecy. How- 
ever, thinking carefully about the relationship between secrecy and anonymity sug- 
gests new and interesting ways of thinking about anonymity. More generally, for- 
malizing anonymity and information hiding in terms of knowledge is useful for 
capturing the intuitions that practitioners have. 

We are not the first to use knowledge and belief to formalize notions of
information hiding. Glasgow, MacEwen, and Panangaden [1992] describe a logic
for reasoning about security that includes both epistemic operators (for reasoning
about knowledge) and deontic operators (for reasoning about permission and
obligation). They characterize some security policies in terms of the facts that an agent
is permitted to know. Intuitively, everything that an agent is not permitted to know 
must remain hidden. Our approach is similar, except that we specify the formulas 
that an agent is not allowed to know, rather than the formulas she is permitted to 
know. One advantage of accentuating the negative is that we do not need to use 
deontic operators in our logic. 

Epistemic logics have also been used to define information-hiding properties, 
including noninterference and anonymity. Gray and Syverson [1998] use an
epistemic logic to define probabilistic noninterference, and Syverson and Stubblebine
[1999] use one to formalize definitions of anonymity. The thrust of our paper is
quite different from these. Gray and Syverson focus on one particular definition 
of information hiding in a probabilistic setting, while Syverson and Stubblebine 
focus on describing an axiom system that is useful for reasoning about real-world 
systems, and on how to reason about and compose parts of the system into adver- 
saries and honest agents. Our focus, on the other hand, is on giving a semantic 
characterization of anonymity in a framework that lends itself well to modeling 
systems. 

Hughes and Shmatikov [2004] position their approach to anonymity (which
is discussed in more detail in Section 5.3) as an attempt to provide an interface
between logic-based approaches, which they claim are good for specifying the
desired properties (like anonymity), and formalisms like CSP, which they claim
are good for specifying systems. We agree with their claim that logic-based
approaches are good for specifying properties of systems, but also claim that, with an
appropriate semantics for the logic, there is no need to provide such an interface. 
While there are many ways of specifying systems, many end up identifying a sys- 
tem with a set of runs or traces, and can thus be embedded in the runs and systems 
framework that we use. 

Definitions of anonymity using epistemic logic are possibilistic. Certainly, if j 
believes that any of 1000 users (including i) could have performed the action that i 
in fact performed, then i has some degree of anonymity with respect to j. However, 
if j believes that the probability that i performed the action is .99, the possibilis- 
tic assurance of anonymity provides little comfort. Most previous formalizations 
of anonymity have not dealt with probability; they typically conclude with an ac- 
knowledgment that it is important to do so, and suggest that their formalism can 
indeed handle probability. One significant advantage of our formalism is that it is 
completely straightforward to add probability in a natural way, using known
techniques [Halpern and Tuttle 1993]. As we show in Section 4, this lets us formalize
the (somewhat less formal) definitions of probabilistic anonymity given by Reiter
and Rubin [1998].

In this paper, we are more concerned with defining and specifying anonymity 
properties than with describing systems for achieving anonymity or with verifying 
anonymity properties. We want to define what anonymity means by using syntactic 
statements that have a well-defined semantics. Our work is similar in spirit to pre- 
vious papers that have given definitions of anonymity and other similar properties, 
such as the proposal for terminology given by Pfitzmann and Köhntopp [2001] and
the information-theoretic definitions of anonymity given by Diaz, Seys, Claessens,
and Preneel [2002].

The rest of this paper is organized as follows. In Section 2, we briefly review
the runs and systems formalism of [Fagin, Halpern, Moses, and Vardi 1995] and
describe how it can be used to represent knowledge. In Section 3, we show how
anonymity can be defined using knowledge, and relate this definition to other
notions of information hiding, particularly secrecy (as defined in our earlier work). In
Section 4, we extend the possibilistic definition of Section 3 so that it can capture
probabilistic concerns. As others have observed [Hughes and Shmatikov 2004;
Reiter and Rubin 1998; Syverson and Stubblebine 1999], there are a number of
ways to define anonymity. Some definitions provide very strong guarantees of
anonymity, while others are easier to verify in practice. Rather than giving an
exhaustive list of definitions, we focus on a few representative notions, and show by
example that our logic is expressive enough to capture many other notions of
interest. In Section 5, we compare our framework to that of three other attempts to
formalize anonymity, by Schneider and Sidiropoulos [1996], Hughes and Shmatikov
[2004], and Stubblebine and Syverson [1999]. We conclude in Section 6.
2 Multiagent Systems: A Review 



In this section, we briefly review the multiagent systems framework; we urge the
reader to consult [Fagin, Halpern, Moses, and Vardi 1995] for more details.

A multiagent system consists of n agents, each of which is in some local state 
at a given point in time. We assume that an agent's local state encapsulates all the 
information to which the agent has access. In the security setting, the local state 
of an agent might include initial information regarding keys, the messages she has 
sent and received, and perhaps the reading of a clock. The framework makes no 
assumptions about the precise nature of the local state. 

We can view the whole system as being in some global state, a tuple consisting 
of the local state of each agent and the state of the environment. Thus, a global
state has the form $(s_e, s_1, \ldots, s_n)$, where $s_e$ is the state of the environment and $s_i$
is agent i's state, for $i = 1, \ldots, n$.

A run is a function from time to global states. Intuitively, a run is a complete 
description of what happens over time in one possible execution of the system. A 
point is a pair (r, m) consisting of a run r and a time m. We make the standard 
assumption that time ranges over the natural numbers. At a point $(r, m)$, the system
is in some global state $r(m)$. If $r(m) = (s_e, s_1, \ldots, s_n)$, then we take $r_i(m)$ to
be $s_i$, agent i's local state at the point $(r, m)$. Note that an agent's local state at
point $(r, m)$ does not necessarily encode all the agent's previous local states. In
some systems, agents have perfect recall, in the sense that their local state $r_i(m)$
encodes their states at times $0, \ldots, m - 1$, but this need not be generally true.
(See [Fagin, Halpern, Moses, and Vardi 1995, Chapter 4] for a formal definition
and discussion of perfect recall.) Formally, a system consists of a set of runs (or
executions). Let $\mathcal{P}(\mathcal{R})$ denote the points in a system $\mathcal{R}$.
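
As a rough illustration of these definitions (it is not part of the formal development), the following Python sketch stores a finite prefix of each run as a list of global states. The names `local_state`, `points`, and the toy `SYSTEM` are hypothetical, and cutting runs off at a finite horizon is an assumption made only so that the example can be executed.

```python
from typing import List, Tuple

GlobalState = Tuple[str, ...]  # (s_e, s_1, ..., s_n); index 0 is the environment
Run = List[GlobalState]        # finite prefix of a run: run[m] plays the role of r(m)


def local_state(run: Run, m: int, i: int) -> str:
    """Return r_i(m), agent i's component of the global state r(m)."""
    return run[m][i]


def points(system: List[Run]) -> List[Tuple[int, int]]:
    """Return the points of the system as (run index, time) pairs."""
    return [(r, m) for r, run in enumerate(system) for m in range(len(run))]


# Hypothetical system with two agents and two runs of length three.
SYSTEM: List[Run] = [
    [("e0", "idle", "wait"), ("e1", "sent", "wait"), ("e2", "sent", "got")],
    [("e0", "idle", "wait"), ("e1", "idle", "wait"), ("e2", "idle", "wait")],
]
```

Here `local_state(SYSTEM[0], 1, 1)` is agent 1's local state at the point (run 0, time 1), and `points(SYSTEM)` enumerates all six points of the toy system.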

The runs and systems framework is compatible with many other standard ap- 
proaches for representing and reasoning about systems. For example, the runs 
might be event traces generated by a CSP process (see Section 5.2), they might be
message-passing sequences generated by a security protocol, or they might be
generated from the strands in a strand space [Halpern and Pucella 2001; Thayer,
Herzog, and Guttman 1999]. The approach is rich enough to accommodate a variety
of system representations.

Another important advantage of the framework is that it is straightforward to
define formally what an agent knows at a point in a system. Given a system $\mathcal{R}$, let
$\mathcal{K}_i(r, m)$ be the set of points in $\mathcal{P}(\mathcal{R})$ that i thinks are possible at $(r, m)$, i.e.,

$$\mathcal{K}_i(r, m) = \{(r', m') \in \mathcal{P}(\mathcal{R}) : r'_i(m') = r_i(m)\}.$$

Agent i knows a fact $\varphi$ at a point $(r, m)$ if $\varphi$ is true at all points in $\mathcal{K}_i(r, m)$.
To make this intuition precise, we need to be able to assign truth values to basic
formulas in a system. We assume that we have a set $\Phi$ of primitive propositions,
which we can think of as describing basic facts about the system. In the context of
security protocols, these might be such facts as "the key is n" or "agent A sent the
message m to B". An interpreted system $\mathcal{I}$ consists of a pair $(\mathcal{R}, \pi)$, where $\mathcal{R}$ is a
system and $\pi$ is an interpretation, which assigns to each primitive proposition in $\Phi$
a truth value at each point. Thus, for every $p \in \Phi$ and point $(r, m)$ in $\mathcal{R}$, we have
$(\pi(r, m))(p) \in \{\mathit{true}, \mathit{false}\}$.

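We can now define what it means for a formula $\varphi$ to be true at a point $(r, m)$
in an interpreted system $\mathcal{I}$, written $(\mathcal{I}, r, m) \models \varphi$, by induction on the structure of
formulas:

• $(\mathcal{I}, r, m) \models p$ iff $(\pi(r, m))(p) = \mathit{true}$

• $(\mathcal{I}, r, m) \models \neg\varphi$ iff $(\mathcal{I}, r, m) \not\models \varphi$

• $(\mathcal{I}, r, m) \models \varphi \wedge \psi$ iff $(\mathcal{I}, r, m) \models \varphi$ and $(\mathcal{I}, r, m) \models \psi$

• $(\mathcal{I}, r, m) \models K_i\varphi$ iff $(\mathcal{I}, r', m') \models \varphi$ for all $(r', m') \in \mathcal{K}_i(r, m)$

As usual, we write $\mathcal{I} \models \varphi$ if $(\mathcal{I}, r, m) \models \varphi$ for all points $(r, m)$ in $\mathcal{I}$.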
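
The truth conditions above can be sketched as a small evaluator over a finite system. This is only an illustration under stated assumptions: formulas are represented as Python predicates on points, the names `possible`, `K`, `Not`, `And`, `valid`, and the coin-toss `SYSTEM` are hypothetical, and runs are truncated to two time steps.

```python
from typing import Callable, List, Tuple

Point = Tuple[int, int]            # (run index, time)
Formula = Callable[[Point], bool]  # truth value of a formula at a point

# Hypothetical toy system: SYSTEM[r][m][i] is agent i's local state
# (index 0 is the environment). Agent 1 sees a coin; agent 2 learns the
# outcome only in run 1 at time 1.
SYSTEM = [
    [("heads", "saw-heads", "blind"), ("heads", "saw-heads", "blind")],
    [("tails", "saw-tails", "blind"), ("tails", "saw-tails", "told-tails")],
]
POINTS: List[Point] = [(r, m) for r in range(len(SYSTEM)) for m in range(len(SYSTEM[r]))]


def possible(i: int, pt: Point) -> List[Point]:
    """K_i(r, m): the points where agent i has the same local state as at pt."""
    r, m = pt
    return [(r2, m2) for (r2, m2) in POINTS if SYSTEM[r2][m2][i] == SYSTEM[r][m][i]]


def Not(phi: Formula) -> Formula:
    return lambda pt: not phi(pt)


def And(phi: Formula, psi: Formula) -> Formula:
    return lambda pt: phi(pt) and psi(pt)


def K(i: int, phi: Formula) -> Formula:
    """K_i phi holds at pt iff phi holds at every point that i considers possible."""
    return lambda pt: all(phi(q) for q in possible(i, pt))


def valid(phi: Formula) -> bool:
    """The system satisfies phi iff phi holds at every point."""
    return all(phi(pt) for pt in POINTS)


def heads(pt: Point) -> bool:
    """A primitive proposition, interpreted from the environment state."""
    return SYSTEM[pt[0]][pt[1]][0] == "heads"
```

In this toy system, `K(1, heads)` holds at the point `(0, 0)` because agent 1 saw the coin, while `K(2, heads)` fails there because agent 2 cannot distinguish that point from `(1, 0)`.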

3 Defining Anonymity Using Knowledge 
3.1 Information-Hiding Definitions 

Anonymity is one example of an information-hiding requirement. Other information- 
hiding requirements include noninterference, privacy, confidentiality, secure message- 
sending, and so on. These requirements are similar, and sometimes they overlap. 
Noninterference, for example, requires a great deal to be hidden, and typically im- 
plies privacy, anonymity, etc., for the classified user whose state is protected by the 
noninterference requirement. 

In an earlier paper [Halpern and O'Neill 2002], we looked at requirements of
total secrecy in multiagent systems. Total secrecy basically requires that in a
system with "classified" and "unclassified" users, the unclassified users should never
be able to infer the actions or the local states of the classified users. For secrecy,
the "what needs to be hidden" component of information-hiding is extremely
restrictive: total secrecy requires that absolutely everything that a classified user does
must be hidden. The "how well does it need to be hidden" component depends on
the situation. Our definition of secrecy says that for any nontrivial fact $\varphi$ (that
is, one that is not already valid) that depends only on the state of the classified or
high-level agent, the formula $\neg K_j \varphi$ must be valid. (See our earlier paper for more
discussion of this definition.) Semantically, this means that whatever the high-level
user does, there exists some run where the low-level user's view of the system is
the same, but the high-level user did something different. Our nonprobabilistic
definitions are fairly strong (simply because secrecy requires that so much be hidden).
The probabilistic definitions we gave require even more: not only can the agent not
learn any new classified fact, but he also cannot learn anything about the
probability of any such fact. (In other words, if an agent initially assigns a classified fact $\varphi$
a probability $\alpha$ of being true, he always assigns $\varphi$ that probability.) It would be
perfectly natural, and possibly quite interesting, to consider definitions of secrecy that
do not require so much to be hidden (e.g., by allowing some classified information
to be declassified [Zdancewic and Myers 2001]), or to discuss definitions that do
not require such strong secrecy (e.g., by giving definitions that were stronger than
the nonprobabilistic definitions we gave, but not quite so strong as the probabilistic
definitions).

3.2 Defining Anonymity 

The basic intuition behind anonymity is that actions should be divorced from the 
agents who perform them, for some set of observers. With respect to the basic 
information-hiding framework outlined above, the information that needs to be 
hidden is the identity of the agent (or set of agents) who perform a particular action. 
Who the information needs to be hidden from, i.e., which observers, depends on 
the situation. The third component of information-hiding requirements — how well 
information needs to be hidden — will often be the most interesting component of 
the definitions of anonymity that we present here. 

Throughout the paper, we use the formula $\theta(i, a)$ to represent "agent i has
performed action a, or will perform a in the future".^1 For future reference, let
$\delta(i, a)$ represent "agent i has performed action a". Note that $\theta(i, a)$ is a fact about
the run: if it is true at some point in a run, it is true at all points in a run (since it is
true even if i performs a at some point in the future). On the other hand, $\delta(i, a)$ may
be false at the start of a run, and then become true at the point where i performs a.

It is not our goal in this paper to provide a "correct" definition of anonymity. 
We also want to avoid giving an encyclopedia of definitions. Rather, we give some 
basic definitions of anonymity to show how our framework can be used. We base 
our choice of definitions in part on definitions presented in earlier papers, to make 
clear how our work relates to previous work, and in part on which definitions of 
anonymity we expect to be useful in practice. We first give an extremely weak
definition, but one that nonetheless illustrates the basic intuition behind any definition
of anonymity.

^1 If we want to consider systems that may crash, we may want to consider $\theta'(i, a)$ instead, where
$\theta'(i, a)$ represents "agent i has performed action a, or will perform a in the future if the system does
not crash". Since issues of failure are orthogonal to the anonymity issues that we focus on here, we
consider only the simpler definition in this paper.



Definition 3.1: Action a, performed by agent i, is minimally anonymous with
respect to agent j in the interpreted system $\mathcal{I}$ if $\mathcal{I} \models \neg K_j[\theta(i, a)]$.

This definition makes it clear what is being hidden ($\theta(i, a)$, the fact that i
performs a) and from whom (j). It also describes how well the information is
hidden: it requires that j not be sure that i actually performed, or will perform, the 
action. Note that this is a weak information-hiding requirement. It might be the 
case, for example, that agent j is certain that the action was performed either by 
i, or by at most one or two other agents, thereby making i a "prime suspect". It 
might also be the case that j is able to place a very high probability on i performing 
the action, even though he isn't absolutely certain of it. (Agent j might know 
that there is some slight probability that some other agent i' performed the action, 
for example.) Nonetheless, it should be the case that for any other definition of 
anonymity we give, if we want to ensure that i's performing action a is to be kept 
anonymous as far as observer j is concerned, then i's action should be at least 
minimally anonymous with respect to j. 

Our definition of a being minimally anonymous with respect to j is equivalent
to the apparently weaker requirement $\mathcal{I} \models \theta(i, a) \Rightarrow \neg K_j[\theta(i, a)]$, which says
that if action a is performed by i, then j does not know it. Clearly if j never
knows that a is performed by i, then j will never know that a is performed by i if
i actually does perform a. To see that the converse holds, it suffices to note that
if i does not perform a, then surely $\neg K_j[\theta(i, a)]$ holds. Thus, this definition, like
several that will follow, can be viewed as having the form "if i performed a, then j
does not know some appropriate fact".
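
The equivalence between the two forms of minimal anonymity can be checked mechanically on small examples. The sketch below is illustrative only: it assumes each run is summarized by the set of agents who perform a somewhere in the run (`performers`) and by the observer's local state at each time (`view_j`); these field names, the helper functions, and the two toy systems are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Point = Tuple[int, int]  # (run index, time)
Runs = List[Dict]


def points(runs: Runs) -> List[Point]:
    return [(r, m) for r, run in enumerate(runs) for m in range(len(run["view_j"]))]


def theta(runs: Runs, i: str) -> Callable[[Point], bool]:
    """theta(i, a): a fact about the whole run, so it ignores the time."""
    return lambda pt: i in runs[pt[0]]["performers"]


def knows_j(runs: Runs, phi: Callable[[Point], bool]) -> Callable[[Point], bool]:
    """K_j phi: phi holds at every point where j has the same local state."""
    def holds(pt: Point) -> bool:
        view = runs[pt[0]]["view_j"][pt[1]]
        return all(phi(q) for q in points(runs) if runs[q[0]]["view_j"][q[1]] == view)
    return holds


def minimally_anonymous(runs: Runs, i: str) -> bool:
    """Definition 3.1: not K_j[theta(i, a)] at every point."""
    k = knows_j(runs, theta(runs, i))
    return all(not k(pt) for pt in points(runs))


def minimally_anonymous_conditional(runs: Runs, i: str) -> bool:
    """The apparently weaker form: theta(i, a) implies not K_j[theta(i, a)]."""
    t, k = theta(runs, i), knows_j(runs, theta(runs, i))
    return all((not t(pt)) or (not k(pt)) for pt in points(runs))


# j sees only that a happened, not who did it.
HIDDEN = [
    {"performers": {"alice"}, "view_j": ["start", "saw-a"]},
    {"performers": {"bob"}, "view_j": ["start", "saw-a"]},
]
# j's view gives alice away.
EXPOSED = [
    {"performers": {"alice"}, "view_j": ["start", "saw-alice"]},
    {"performers": {"bob"}, "view_j": ["start", "saw-a"]},
]
```

On both toy systems the two checks return the same answer for every agent, as the argument above predicts; only `HIDDEN` makes alice's action minimally anonymous.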

The definition of minimal anonymity also makes it clear how anonymity relates
to secrecy, as defined in our earlier work [Halpern and O'Neill 2002]. To explain
how, we first need to describe how we defined secrecy in terms of knowledge.
Given a system $\mathcal{I}$, say that $\varphi$ is nontrivial in $\mathcal{I}$ if $\mathcal{I} \not\models \varphi$, and that $\varphi$ depends only
on the local state of agent i in $\mathcal{I}$ if $\mathcal{I} \models \varphi \Rightarrow K_i\varphi$. Intuitively, $\varphi$ is nontrivial in
$\mathcal{I}$ if $\varphi$ could be false in $\mathcal{I}$, and $\varphi$ depends only on i's local state if i always knows
whether or not $\varphi$ is true. (It is easy to see that $\varphi$ depends only on the local state of i
if $(\mathcal{I}, r, m) \models \varphi$ and $r_i(m) = r'_i(m')$ implies that $(\mathcal{I}, r', m') \models \varphi$.) According to
the definition in [Halpern and O'Neill 2002], agent i maintains total secrecy with
respect to another agent j in system $\mathcal{I}$ if for every nontrivial fact $\varphi$ that depends
only on the local state of i, the formula $\neg K_j\varphi$ is valid for the system. That is, i
maintains total secrecy with respect to j if j does not learn anything new about
agent i's state. In general, $\theta(i, a)$ does not depend only on i's local state, because
whether i performs a may depend on whether or not i gets a certain message from
some other agent i'. On the other hand, if whether or not i performs a depends
only on i's protocol, and the protocol is encoded in i's local state, then $\theta(i, a)$
depends only on i's local state. If $\theta(i, a)$ does depend only on i's local state and j
did not know all along that i was going to perform action a (i.e., if we assume that
$\theta(i, a)$ is nontrivial), then Definition 3.1 is clearly a special case of the definition
of secrecy. In any case, it is in much the same spirit as the definition of secrecy.
Essentially, anonymity says that the fact that agent i has performed or will perform
action a must be hidden from j, while total secrecy says that all facts that depend on
agent i must be hidden from j.

Note that this definition of minimal anonymity is different from the one we
gave in the conference version of this paper [Halpern and O'Neill 2003]. There,
the definition given used $\delta(i, a)$ rather than $\theta(i, a)$. We say that a performed by
agent i is minimally $\delta$-anonymous if Definition 3.1 holds, with $\theta(i, a)$ replaced
by $\delta(i, a)$. It is easy to see that minimal anonymity implies minimal $\delta$-anonymity
(since $\delta(i, a)$ implies $\theta(i, a)$), but the converse is not true in general. For example,
suppose that j gets a signal if i is going to perform action a (before i actually
performs the action), but then never finds out exactly when i performs a. Then
minimal anonymity does not hold. In runs where i performs a, agent j knows that
i will perform a when he gets the signal. On the other hand, minimal $\delta$-anonymity
does hold, because j never knows when i performs a. In this situation, minimal
anonymity seems to capture our intuitions of what anonymity should mean better
than minimal $\delta$-anonymity does.
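
The signal example can be made concrete with a three-run system. This is a sketch under stated assumptions: the field names `a_time` and `view_j`, the three-step horizon, and the convention that an action taken at time t counts as "performed" from time t onward are all choices made for illustration.

```python
from typing import Callable, List, Tuple

Point = Tuple[int, int]  # (run index, time)

# Hypothetical runs. "a_time" is the time at which i performs a (None if never).
# "view_j" is j's local state at times 0, 1, 2; j has no clock, so after the
# signal its local state never changes.
RUNS = [
    {"a_time": 1, "view_j": ["signal", "signal", "signal"]},
    {"a_time": 2, "view_j": ["signal", "signal", "signal"]},
    {"a_time": None, "view_j": ["quiet", "quiet", "quiet"]},
]
POINTS: List[Point] = [(r, m) for r, run in enumerate(RUNS) for m in range(len(run["view_j"]))]


def theta(pt: Point) -> bool:
    """i has performed a, or will perform a later in this run."""
    return RUNS[pt[0]]["a_time"] is not None


def delta(pt: Point) -> bool:
    """i has already performed a by the time of this point."""
    t = RUNS[pt[0]]["a_time"]
    return t is not None and t <= pt[1]


def knows_j(phi: Callable[[Point], bool]) -> Callable[[Point], bool]:
    def holds(pt: Point) -> bool:
        view = RUNS[pt[0]]["view_j"][pt[1]]
        return all(phi(q) for q in POINTS if RUNS[q[0]]["view_j"][q[1]] == view)
    return holds


def never_known(phi: Callable[[Point], bool]) -> bool:
    """not K_j phi is valid in the system."""
    k = knows_j(phi)
    return all(not k(pt) for pt in POINTS)
```

Here `never_known(theta)` is false, since the signal tells j that i performs a at some point, while `never_known(delta)` is true, since j can never rule out that the action is still to come.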

The next definition of anonymity we give is much stronger. It requires that if 
some agent i performs an action anonymously with respect to another agent j, then 
j must think it possible that the action could have been performed by any of the 
agents (except for j). Let $P_j\varphi$ be an abbreviation for $\neg K_j \neg\varphi$. The operator $P_j$ is
the dual of $K_j$; intuitively, $P_j\varphi$ means "agent j thinks that $\varphi$ is possible".

Definition 3.2: Action a, performed by agent i, is totally anonymous with respect
to j in the interpreted system $\mathcal{I}$ if

$$\mathcal{I} \models \theta(i, a) \Rightarrow \bigwedge_{i' \neq j} P_j[\theta(i', a)].$$

Definition 3.2 captures the notion that an action is anonymous if, as far as the
observer in question is concerned, it could have been performed by anybody in the 
system. 
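
A check for total anonymity on a finite system might look as follows. As before, this is an illustrative sketch: runs are summarized by hypothetical `performers` and `view_j` fields, the observer is a hypothetical agent named `judy`, and both toy systems are invented for the example.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int]  # (run index, time)
Runs = List[Dict]

AGENTS = ["alice", "bob", "carol", "judy"]  # judy plays the observer j


def points(runs: Runs) -> List[Point]:
    return [(r, m) for r, run in enumerate(runs) for m in range(len(run["view_j"]))]


def thinks_possible(runs: Runs, pt: Point, agent: str) -> bool:
    """P_j[theta(agent, a)]: some point that j cannot tell apart from pt
    lies in a run where agent performs a."""
    view = runs[pt[0]]["view_j"][pt[1]]
    return any(agent in runs[r]["performers"]
               for (r, m) in points(runs) if runs[r]["view_j"][m] == view)


def totally_anonymous(runs: Runs, i: str, j: str = "judy") -> bool:
    """theta(i, a) implies P_j[theta(i', a)] for every agent i' other than j."""
    others = [x for x in AGENTS if x != j]
    return all(thinks_possible(runs, pt, x)
               for pt in points(runs) if i in runs[pt[0]]["performers"]
               for x in others)


# Every agent's action looks the same to judy.
UNIFORM = [
    {"performers": {"alice"}, "view_j": ["start", "saw-a"]},
    {"performers": {"bob"}, "view_j": ["start", "saw-a"]},
    {"performers": {"carol"}, "view_j": ["start", "saw-a"]},
]
# Carol's action looks different, so judy can rule carol out when she sees "saw-a".
NARROWED = [
    {"performers": {"alice"}, "view_j": ["start", "saw-a"]},
    {"performers": {"bob"}, "view_j": ["start", "saw-a"]},
    {"performers": {"carol"}, "view_j": ["start", "saw-other"]},
]
```

In `NARROWED`, alice's action is still minimally anonymous (judy cannot tell alice from bob), but it is not totally anonymous, because judy can exclude carol.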

Again, in the conference version of the paper, we defined total anonymity
using $\delta(i, a)$ rather than $\theta(i, a)$. (The same remark holds for all the other definitions
of anonymity that we give, although we do not always say so explicitly.) Let total
$\delta$-anonymity be the anonymity requirement obtained when $\theta(i, a)$ is replaced by
$\delta(i, a)$. It is not hard to show that if agents have perfect recall (which intuitively
means that their local state keeps track of all the actions they have performed;
see [Fagin, Halpern, Moses, and Vardi 1995] for the formal definition), then total
$\delta$-anonymity implies total anonymity. This is not true, in general, without perfect
recall, because it might be possible for some agent to know that i will perform
action a (and therefore that no other agent will) but forget this fact by the time that
i actually performs a. Similarly, total anonymity does not imply total $\delta$-anonymity.
To see why, suppose that the agents are numbered $1, \ldots, n$, and that an outside
observer knows that if j performs action a, then j will perform it at time j. Then
total anonymity may hold even though total $\delta$-anonymity does not. For example,
at time 3, although the observer may consider it possible that agent 4 will perform
the action (at time 4), he cannot consider it possible that 4 has already performed
the action, as required by total $\delta$-anonymity.

Chaum [1988] showed that total anonymity could be obtained using DC-nets.
Recall that in a DC-net, a group of n users use Chaum's dining cryptographers
protocol (described in the same paper) to achieve anonymous communication. If
we model a DC-net as an interpreted multiagent system $\mathcal{I}$ whose agents consist
exclusively of agents participating in a single DC-net, then if an agent i sends
a message using the DC-net protocol, that action is totally anonymous. (Chaum
proves this, under the assumption that any message could be generated by any user
in the system.) Note that in the dining cryptographers example, total anonymity
and total $\delta$-anonymity agree, because who paid is decided before the protocol starts.

It is easy to show that if an action is totally anonymous, then it must be
minimally anonymous as well, as long as two simple requirements are satisfied. First,
there must be at least 3 agents in the system. (A college student with only one
roommate can't leave out her dirty dishes anonymously, but a student with at least
two roommates might be able to.) Second, it must be the case that a can be
performed only once in a given run of the system. Otherwise, it might be possible for
j to think that any agent $i' \neq i$ could have performed a, but for j to know that agent
i did, indeed, perform a. For example, consider a system with three agents besides
j. Agent j might know that all three of the other agents performed action a. In that
case, in particular, j knows that i performed a, so action a performed by i is not
minimally anonymous with respect to j, but is totally anonymous. We anticipate
that this assumption will typically be met in practice. It is certainly consistent with
examples of anonymity given in the literature. (See, for example, [Chaum 1988;
Schneider and Sidiropoulos 1996].) In any case, if it is not met, it is possible to tag
occurrences of an action (so that we can talk about the kth time a is performed).
Thus, we can talk about the kth occurrence of an action being anonymous.
Because the kth occurrence of an action can only happen once in any given run, our
requirement is satisfied.
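
The counterexample with three agents who all perform a can be checked directly. The sketch below is an illustration under assumptions: runs are summarized by hypothetical `performers` and `view_j` fields, the observer is left implicit, and `at_most_one_performer` is a hypothetical helper that tests the "performed only once" requirement.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int]  # (run index, time)
Runs = List[Dict]

OTHERS = ["alice", "bob", "carol"]  # all agents other than the observer j


def points(runs: Runs) -> List[Point]:
    return [(r, m) for r, run in enumerate(runs) for m in range(len(run["view_j"]))]


def same_view(runs: Runs, pt: Point) -> List[Point]:
    view = runs[pt[0]]["view_j"][pt[1]]
    return [(r, m) for (r, m) in points(runs) if runs[r]["view_j"][m] == view]


def minimally_anonymous(runs: Runs, i: str) -> bool:
    """j never knows theta(i, a)."""
    return all(any(i not in runs[r]["performers"] for (r, _) in same_view(runs, pt))
               for pt in points(runs))


def totally_anonymous(runs: Runs, i: str) -> bool:
    """Whenever i performs a, j thinks every agent other than j might have."""
    return all(any(x in runs[r]["performers"] for (r, _) in same_view(runs, pt))
               for pt in points(runs) if i in runs[pt[0]]["performers"]
               for x in OTHERS)


def at_most_one_performer(runs: Runs) -> bool:
    return all(len(run["performers"]) <= 1 for run in runs)


# Everyone performs a in the only run, so j knows that alice did.
EVERYONE = [{"performers": {"alice", "bob", "carol"}, "view_j": ["start", "saw-a"]}]
# Exactly one performer per run, indistinguishable to j.
ONE_EACH = [
    {"performers": {"alice"}, "view_j": ["start", "saw-a"]},
    {"performers": {"bob"}, "view_j": ["start", "saw-a"]},
    {"performers": {"carol"}, "view_j": ["start", "saw-a"]},
]
```

In `EVERYONE` the action is totally anonymous but not minimally anonymous, which is possible only because the "performed only once" requirement fails there; in `ONE_EACH` both properties hold.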

Proposition 3.3: Suppose that there are at least three agents in the interpreted
system $\mathcal{I}$ and that

$$\mathcal{I} \models \bigwedge_{i \neq i'} \neg[\theta(i, a) \wedge \theta(i', a)].$$

If action a, performed by agent i, is totally anonymous with respect to j, then it is
minimally anonymous as well.

Proof: Suppose that action a is totally anonymous. Because there are at least three
agents in the system, there is some agent i' other than i and j, and by total anonymity,
$\mathcal{I} \models \theta(i, a) \Rightarrow P_j[\theta(i', a)]$. If $(\mathcal{I}, r, m) \models \neg\theta(i, a)$, clearly $(\mathcal{I}, r, m) \models \neg K_j[\theta(i, a)]$.
Otherwise, $(\mathcal{I}, r, m) \models P_j[\theta(i', a)]$ by total anonymity. Thus, there exists a point
$(r', m')$ such that $r'_j(m') = r_j(m)$ and $(\mathcal{I}, r', m') \models \theta(i', a)$. By our assumption,
$(\mathcal{I}, r', m') \models \neg\theta(i, a)$, because $i \neq i'$. Therefore, $(\mathcal{I}, r, m) \models \neg K_j[\theta(i, a)]$. It
follows that a is minimally anonymous with respect to j.

Definitions 3.1 and 3.2 are conceptually similar, even though the latter
definition is much stronger. Once again, there is a set of formulas that an observer is
not allowed to know. With the earlier definition, there is only one formula in this
set: $\theta(i, a)$. As long as j doesn't know that i performed action a, this requirement
is satisfied. With total anonymity, there are more formulas that j is not allowed
to know: they take the form $\neg\theta(i', a)$. Before, we could guarantee only that j
did not know that i did the action; here, for many agents i', we guarantee that j
does not know that i' did not do the action. The definition is made slightly more
complicated by the implication, which restricts the conditions under which j is not
allowed to know $\neg\theta(i', a)$. (If i didn't actually perform the action, we don't care
what j thinks, since we are concerned only with anonymity with respect to i.) But
the basic idea is the same.

Note that total anonymity does not necessarily follow from total secrecy,
because the formula $\neg\theta(i', a)$, for $i' \neq i$, does not, in general, depend only on the
local state of i. It is therefore perfectly consistent with the definition of total
secrecy for j to learn this fact, in violation of total anonymity. (Secrecy, of course,
does not follow from anonymity, because secrecy requires that many more facts be 
hidden than simply whether i performed a given action.) 

Total anonymity is a very strong requirement. Often, an action will not be 
totally anonymous, but only anonymous up to some set of agents who could have 
performed the action. This situation merits a weaker definition of anonymity. To be 
more precise, let I be the set of all agents of the system and suppose that we have 
some set $I_A \subseteq I$ (an "anonymity set", using the terminology of Chaum [1988]
and Pfitzmann and Köhntopp [2001]) of agents who can perform some action.
We can define anonymity in terms of this set.

Definition 3.4: Action a, performed by agent i, is anonymous up to $I_A \subseteq I$ with
respect to j if

$$\mathcal{I} \models \theta(i, a) \Rightarrow \bigwedge_{i' \in I_A} P_j[\theta(i', a)].$$

In the anonymous message-passing system Herbivore [Goel, Robson, Polte, and Sirer 2002],
users are organized into cliques $C_1, \ldots, C_n$, each of which uses the dining
cryptographers protocol [Chaum 1988] for anonymous message-transmission. If a user
wants to send an anonymous message, she can do so through her clique. Herbivore
claims that any user i is able to send a message anonymously up to $C_j$, where
$i \in C_j$. As the size of a user's clique varies, so does the strength of the anonymity
guarantees provided by the system. 

In some situations, it is not necessary that there be a fixed anonymity set, as in
Definition 3.4. It suffices that, at all times, there exists some anonymity set with at
least, say, k agents. This leads to a definition of k-anonymity.

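Definition 3.5: Action a, performed by agent i, is k-anonymous with respect to j
if

$$\mathcal{I} \models \theta(i, a) \Rightarrow \bigvee_{\{I_A : |I_A| = k\}} \bigwedge_{i' \in I_A} P_j[\theta(i', a)].$$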

This definition says that at any point j must think it possible that any of at 
least k agents might perform, or have performed, the action. Note that the set of 
k agents might be different in different runs, making this condition strictly weaker 
than anonymity up to a particular set of size k. 
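
The difference between anonymity up to a fixed set and k-anonymity can be checked
mechanically on a small finite model. The following is only a sketch under stated
assumptions: the representation (a list of points, a dictionary `view_j` giving the
observer's local state at each point, and a dictionary `performers` giving the agents
i' for which $\theta(i', a)$ holds at each point) and all function names are hypothetical
and are not part of the framework itself.

```python
def suspects(points, view_j, performers, p):
    """Agents i' for which P_j[theta(i', a)] holds at point p."""
    found = set()
    for q in points:
        if view_j[q] == view_j[p]:  # j cannot tell p and q apart
            found |= performers[q]
    return found


def anonymous_up_to(points, view_j, performers, i, anonymity_set):
    """Anonymity up to a fixed set, checked wherever theta(i, a) holds."""
    return all(
        set(anonymity_set) <= suspects(points, view_j, performers, p)
        for p in points
        if i in performers[p]
    )


def k_anonymous(points, view_j, performers, i, k):
    """k-anonymity: at every such point, j suspects at least k agents."""
    return all(
        len(suspects(points, view_j, performers, p)) >= k
        for p in points
        if i in performers[p]
    )


# Hypothetical toy system: four points; observer j sees only "x" or "y".
points = ["p1", "p2", "p3", "p4"]
view_j = {"p1": "x", "p2": "x", "p3": "y", "p4": "y"}
performers = {"p1": {1}, "p2": {2}, "p3": {1}, "p4": {3}}

print(k_anonymous(points, view_j, performers, 1, 2))           # True
print(anonymous_up_to(points, view_j, performers, 1, {1, 2}))  # False
```

In this toy system agent 1 is 2-anonymous, because j always suspects two agents, but
the pair of suspects is {1, 2} at one point and {1, 3} at another, so the action is not
anonymous up to any fixed set of size 2.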

A number of systems have been proposed that provide k-anonymity for some
k. In the anonymous communications network protocol recently proposed by von
Ahn, Bortz, and Hopper [von Ahn, Bortz, and Hopper 2003], users can send messages
with guarantees of k-anonymity. In the system $P^5$ (for "Peer-to-Peer Personal
Privacy Protocol") [Sherwood, Bhattacharjee, and Srinivasan 2002], users join
a logical broadcast tree that provides anonymous communication, and users can
choose what level of k-anonymity they want, given that k-anonymity for a higher
value of k makes communication less efficient. Herbivore [Goel, Robson, Polte, and Sirer 2002]
provides anonymity using cliques of DC-nets. If the system guarantees that the
cliques all have a size of at least k, so that regardless of clique composition, there
are at least k users capable of sending any anonymous message, then Herbivore
guarantees k-anonymity.

3.3 A More Detailed Example: Dining Cryptographers 

A well-known example of anonymity in the computer security literature is Chaum's
"dining cryptographers problem" [Chaum 1988]. In the original description of this
problem, three cryptographers sit down to dinner and are informed by the host that 
someone has already paid the bill anonymously. The cryptographers decide that 
the bill was paid either by one of the three people in their group, or by an outside 
agency such as the NSA. They want to find out which of these two situations is 
the actual one while preserving the anonymity of the cryptographer who (might 
have) paid. Chaum provides a protocol that the cryptographers can use to solve 
this problem. To guarantee that it works, however, it would be nice to check that 
anonymity conditions hold. Assuming we have a system that includes a set of three 
cryptographer agents $C = \{0, 1, 2\}$, as well as an outside observer agent o, the
protocol should guarantee that for each agent $i \in C$, and each agent $j \in C - \{i\}$,
the act of paying is anonymous up to $C - \{j\}$ with respect to j. For an outside
observer o, i.e., an agent other than one of the three cryptographers, the protocol should
guarantee that for each agent $i \in C$, the act of paying is anonymous up to C with respect
to o. This can be made precise using our definition of anonymity up to a set. 

Because the requirements are symmetric for each of the three cryptographers,
we can describe the anonymity specification compactly by naming the agents using
modular arithmetic. We use $\oplus$ to denote addition mod 3. Let the interpreted
system $\mathcal{I} = (\mathcal{R}, \pi)$ represent the possible runs of one instance of the dining
cryptographers protocol, where the interpretation $\pi$ interprets formulas of the form
$\theta(i, \text{"paid"})$ in the obvious way. The following knowledge-based requirements
comprise the anonymity portion of the protocol's specification, for each agent
$i \in C$:

$$\mathcal{I} \models \theta(i, \text{"paid"}) \Rightarrow P_{i \oplus 1}[\theta(i \oplus 2, \text{"paid"})] \wedge P_{i \oplus 2}[\theta(i \oplus 1, \text{"paid"})] \wedge P_o[\theta(i \oplus 1, \text{"paid"})] \wedge P_o[\theta(i \oplus 2, \text{"paid"})].$$

This means that if a cryptographer paid, then each of the other cryptographers 
must think it possible that the third cryptographer could have paid. In addition, an 
outside observer must think it possible that either of the other two cryptographers 
could have paid. 
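
For a system this small, the specification can be checked by brute force. The sketch
below makes several assumptions that are not in the text: it models one round of the
standard three-party protocol (each adjacent pair shares a coin, and each cryptographer
announces the XOR of the two coins she sees, flipped if she paid), it takes a
cryptographer's local state to be her two coins, whether she paid, and all
announcements, and it takes the observer's local state to be the announcements alone.
All names are hypothetical.

```python
from itertools import product

CRYPTOGRAPHERS = (0, 1, 2)
OBSERVER = "o"

# A run is (coins, payer). coins[i] is the coin shared by cryptographers
# i and (i + 1) % 3; payer is 0, 1, 2, or None (an outside agency paid).
RUNS = [(coins, payer)
        for coins in product((0, 1), repeat=3)
        for payer in (0, 1, 2, None)]


def announcements(coins, payer):
    """Each cryptographer XORs her two coins and flips the result if she paid."""
    return tuple(coins[i] ^ coins[(i - 1) % 3] ^ int(payer == i)
                 for i in CRYPTOGRAPHERS)


def local_state(agent, run):
    coins, payer = run
    said = announcements(coins, payer)
    if agent == OBSERVER:
        return said
    return (coins[agent], coins[(agent - 1) % 3], payer == agent, said)


def thinks_possible(agent, run, suspect):
    """P_agent[theta(suspect, "paid")] at the given run."""
    here = local_state(agent, run)
    return any(other[1] == suspect and local_state(agent, other) == here
               for other in RUNS)


def anonymity_spec_holds():
    for i in CRYPTOGRAPHERS:
        a, b = (i + 1) % 3, (i + 2) % 3
        for run in RUNS:
            if run[1] != i:
                continue
            if not (thinks_possible(a, run, b) and thinks_possible(b, run, a)
                    and thinks_possible(OBSERVER, run, a)
                    and thinks_possible(OBSERVER, run, b)):
                return False
    return True


print(anonymity_spec_holds())  # True
```

Under these modeling assumptions, all four conjuncts hold in every run in which some
cryptographer paid, while the XOR of the three announcements still reveals whether a
cryptographer (rather than the outside agency) paid.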

4 Probabilistic Variants of Anonymity



4.1 Probabilistic Anonymity 

All of the definitions presented in Section 3 were nonprobabilistic. As we
mentioned in the introduction, this is a serious problem for the "how well is
information hidden" component of the definitions. For all the definitions we gave, it was
necessary only that observers think it possible that multiple agents could have
performed the anonymous action. However, an event that is possible may nonetheless
be extremely unlikely. Consider our definition of total anonymity (Definition 3.2).
It states that an action performed by i is totally anonymous if the observer j thinks
it could have been performed by any agent other than j. This may seem like a
strong requirement, but if there are, say, 102 agents, and j can determine that i
performed action a with probability 0.99 and that each of the other agents performed
action a with probability 0.0001, agent i might not be very happy with the
guarantees provided by total anonymity. Of course, the appropriate notion of anonymity
will depend on the application: i might be content to know that no agent can prove
that she performed the anonymous action. In that case, it might suffice for the
action to be only minimally anonymous. However, in many other cases, an agent
might want a more quantitative, probabilistic guarantee that it will be considered
reasonably likely that other agents could have performed the action.

Adding probability to the runs and systems framework is straightforward. The
approach we use goes back to [Halpern and Tuttle 1993], and was also used in our
work on secrecy [Halpern and O'Neill 2002], so we just briefly review the relevant
details here. Given a system $\mathcal{R}$, suppose we have a probability measure $\mu$ on the
runs of $\mathcal{R}$. The pair $(\mathcal{R}, \mu)$ is a probabilistic system. For simplicity, we assume that
every subset of $\mathcal{R}$ is measurable. We are interested in the probability that an agent
assigns to an event at the point (r, m). For example, we may want to know that
at the point (r, m), observer i places a probability of 0.6 on j's having performed
some particular action. We want to condition the probability $\mu$ on $\mathcal{K}_i(r, m)$, the
information that i has at the point (r, m). The problem is that $\mathcal{K}_i(r, m)$ is a set of
points, while $\mu$ is a probability on runs. This problem is dealt with as follows.

Given a set U of points, let $\mathcal{R}(U)$ consist of the runs in $\mathcal{R}$ going through a
point in U. That is,

$$\mathcal{R}(U) = \{r \in \mathcal{R} : (r, m) \in U \text{ for some } m\}.$$

The idea will be to condition $\mu$ on $\mathcal{R}(\mathcal{K}_i(r, m))$ rather than on $\mathcal{K}_i(r, m)$. To make
sure that conditioning is well defined, we assume that $\mu(\mathcal{R}(\mathcal{K}_i(r, m))) > 0$ for
each agent i, run r, and time m. That is, $\mu$ assigns positive probability to the set of
runs in $\mathcal{R}$ compatible with what happens in run r up to time m, as far as agent i is
concerned.

With this assumption, we can define a measure $\mu_{r,m,i}$ on the points in $\mathcal{K}_i(r, m)$
as follows. If $S \subseteq \mathcal{R}$, define $\mathcal{K}_i(r, m)(S)$ to be the set of points in $\mathcal{K}_i(r, m)$ that
lie on runs in S; that is,

$$\mathcal{K}_i(r, m)(S) = \{(r', m') \in \mathcal{K}_i(r, m) : r' \in S\}.$$

Let $\mathcal{F}_{r,m,i}$, the measurable subsets of $\mathcal{K}_i(r, m)$ (that is, the sets to which $\mu_{r,m,i}$
assigns a probability), consist of all sets of the form $\mathcal{K}_i(r, m)(S)$, where $S \subseteq \mathcal{R}$.
Then define $\mu_{r,m,i}(\mathcal{K}_i(r, m)(S)) = \mu(S \mid \mathcal{R}(\mathcal{K}_i(r, m)))$. It is easy to check that
$\mu_{r,m,i}$ is a probability measure, essentially defined by conditioning.
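
The construction amounts to ordinary conditioning on the set of runs that pass through
points the agent cannot distinguish from the current one. A minimal sketch for a finite
system follows; it assumes that runs are dictionary keys, that points are (run, time)
pairs, and that the local state is supplied as a function. These representation choices
and all names are hypothetical.

```python
from fractions import Fraction


def indistinguishable(points, local_state, i, point):
    """K_i(r, m): the points at which agent i has the same local state."""
    return {q for q in points if local_state(i, q) == local_state(i, point)}


def runs_through(point_set):
    """R(U): the runs that pass through some point of U."""
    return {r for (r, _m) in point_set}


def conditional_measure(mu, points, local_state, i, point, S):
    """mu_{r,m,i}(K_i(r,m)(S)), computed as mu(S | R(K_i(r,m)))."""
    compatible = runs_through(indistinguishable(points, local_state, i, point))
    total = sum(mu[r] for r in compatible)
    if total == 0:
        raise ValueError("conditioning event has probability zero")
    return sum(mu[r] for r in compatible & set(S)) / total


# Hypothetical system: three runs, two time steps, one agent "i" who sees
# nothing at time 0 and an observation "a" or "b" at time 1.
mu = {"r1": Fraction(1, 2), "r2": Fraction(1, 4), "r3": Fraction(1, 4)}
points = [(r, m) for r in mu for m in (0, 1)]
observations = {"r1": "a", "r2": "a", "r3": "b"}


def local_state(i, point):
    r, m = point
    return "start" if m == 0 else observations[r]


print(conditional_measure(mu, points, local_state, "i", ("r1", 1), {"r1"}))  # 2/3
```

At time 1 in run r1 the agent has ruled out r3, so the weight 1/2 of r1 is renormalized
by the weight 3/4 of the runs still compatible with her observation.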

Define a probabilistic interpreted system $\mathcal{I}$ to be a tuple $(\mathcal{R}, \mu, \pi)$, where
$(\mathcal{R}, \mu)$ is a probabilistic system. In a probabilistic interpreted system, we can give
semantics to syntactic statements of probability. Following [Fagin, Halpern, and Megiddo 1990],
we will be most interested in formulas of the form $\Pr_i(\varphi) \le \alpha$ (or similar formulas
with >, <, or = instead of $\le$). Intuitively, a formula such as $\Pr_i(\varphi) \le \alpha$ is true
at a point (r, m) if, according to $\mu_{r,m,i}$, the probability that $\varphi$ is true is at most $\alpha$.
More formally, $(\mathcal{I}, r, m) \models \Pr_i(\varphi) \le \alpha$ if

$$\mu_{r,m,i}(\{(r', m') \in \mathcal{K}_i(r, m) : (\mathcal{I}, r', m') \models \varphi\}) \le \alpha.$$

Similarly, we can give semantics to $\Pr_i(\varphi) < \alpha$ and $\Pr_i(\varphi) = \alpha$, as well as
conditional formulas such as $\Pr_i(\varphi \mid \psi) \le \alpha$. Note that although these formulas talk
about probability, they are either true or false at a given state.

It is straightforward to define probabilistic notions of anonymity in probabilistic
systems. We can think of Definition 3.1, for example, as saying that j's
probability that i performs the anonymous action a must be less than 1 (assuming that
every nonempty set has positive probability). This can be generalized by specifying
some $\alpha \le 1$ and requiring that the probability of $\theta(i, a)$ be less than $\alpha$.

Definition 4.1: Action a, performed by agent i, is $\alpha$-anonymous with respect to
agent j if $\mathcal{I} \models \theta(i, a) \Rightarrow \Pr_j[\theta(i, a)] < \alpha$.

Note that if we replace $\theta(i, a)$ by $\delta(i, a)$ in Definition 4.1, the resulting notion
might not be well defined. The problem is that the set

$$\{(r', m') \in \mathcal{K}_i(r, m) : (\mathcal{I}, r', m') \models \delta(i, a)\}$$

may not be measurable; it may not have the form $\mathcal{K}_i(r, m)(S)$ for some $S \subseteq \mathcal{R}$.
The problem does not arise if $\mathcal{I}$ is a synchronous system (in which case i knows the
time, and all the points in $\mathcal{K}_i(r, m)$ are of the form (r', m)), but it does arise if $\mathcal{I}$
is asynchronous. We avoid this technical problem by working with $\theta(i, a)$ rather
than $\delta(i, a)$.

Definition 4.1, unlike Definition 3.1, includes an implication involving $\theta(i, a)$.
It is easy to check that Definition 3.1 does not change when such an implication
is added; intuitively, if $\theta(i, a)$ is false then $\neg K_j[\theta(i, a)]$ is trivially true.
Definition 4.1, however, would change if we removed the implication, because it might
be possible for j to have a high probability of $\theta(i, a)$ even though it isn't true. We
include the implication because without it, we place constraints on what j thinks
about $\theta(i, a)$ even if i has not performed the action a and will not perform it in the
future. Such a requirement, while interesting, seems more akin to "unsuspectibility"
than to anonymity.

Two of the notions of probabilistic anonymity considered by Reiter and Rubin
[1998] in the context of their Crowds system can be understood in terms of
$\alpha$-anonymity. Reiter and Rubin say that a sender has probable innocence if, from
an observer's point of view, the sender "appears no more likely to be the originator
than to not be the originator". This is simply 0.5-anonymity. (Under reasonable
assumptions, Crowds provides 0.5-anonymity for Web requests.) Similarly, a sender
has possible innocence if, from an observer's point of view, "there is a nontrivial
probability that the real sender is someone else". This corresponds to minimal
anonymity (as defined in Section 3.2), or to $\epsilon$-anonymity for some nontrivial value
of $\epsilon$.

It might seem at first that Definition 4.1 should be the only definition of anonymity
we need: as long as j's probability of i performing the action is low enough, i 
should have nothing to worry about. However, with further thought, it is not hard 
to see that this is not the case. 

Consider a scenario where there are 1002 agents, and where $\alpha = 0.11$. Suppose
that the probability, according to Alice, that Bob performs the action is 0.1, but
that her probability that any of the other 1000 agents performs the action is 0.0009
(for each agent). Alice's probability that Bob performs the action is small, but her
probability that any particular other agent performs it is more than two orders of
magnitude smaller. Bob is obviously the prime suspect.
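
The arithmetic behind this scenario is easy to verify. The sketch below encodes Alice's
beliefs as exact fractions; the dictionary layout and the helper names are hypothetical
and serve only to illustrate the point.

```python
from fractions import Fraction

alpha = Fraction(11, 100)

# Alice's probability, for each other agent, that the agent performs the action.
belief = {"Bob": Fraction(1, 10)}
belief.update({f"agent{n}": Fraction(9, 10000) for n in range(1000)})


def is_alpha_anonymous(belief, agent, alpha):
    """True if the observer's probability for this agent is below alpha."""
    return belief[agent] < alpha


def prime_suspect(belief):
    """The agent to whom the observer assigns the highest probability."""
    return max(belief, key=belief.get)


print(sum(belief.values()))                      # 1
print(is_alpha_anonymous(belief, "Bob", alpha))  # True
print(prime_suspect(belief))                     # Bob
print(belief["Bob"] / belief["agent0"])          # 1000/9
```

Bob satisfies the 0.11-anonymity condition, yet Alice considers him about 111 times
more likely than any other individual agent to have performed the action.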

This concern was addressed by Serjantov and Danezis [2002] in their paper on
information-theoretic definitions of anonymity. They consider the probability that
each agent in an anonymity set is the sender of some anonymous message, and
use entropy to quantify the amount of information that the system is leaking; Diaz
et al. [2002] and Danezis [2003] use similar techniques. In this paper we are not
concerned with quantitative measurements of anonymity, but we do agree that it
is worthwhile to consider stronger notions of anonymity than the nonprobabilistic
definitions, or even $\alpha$-anonymity, can provide. We hope to examine quantitative
definitions in future work.

The next definition strengthens Definition 4.1 in the way that Definition 3.2
strengthens Definition 3.1. It requires that no agent in the anonymity set be a more
likely suspect than any other.

Definition 4.2: Action a, performed by agent i, is strongly probabilistically
anonymous up to $I_A$ with respect to agent j if for each $i' \in I_A$,

$$\mathcal{I} \models \theta(i, a) \Rightarrow \Pr_j[\theta(i, a)] = \Pr_j[\theta(i', a)].$$

Depending on the size of $I_A$, this definition can be extremely strong. It does
not state simply that for all agents in $I_A$, the observer must think it is reasonably
likely that the agent could have performed the action; it also says that the observer's
probabilities must be the same for each such agent. Of course, we could weaken
the definition somewhat by not requiring that all the probabilities be equal, but
by instead requiring that they be approximately equal (i.e., that their difference be
small or that their ratio be close to 1). Reiter and Rubin [1998], for example, say
that the sender of a message is beyond suspicion if she "appears no more likely to
be the originator of that message than any other potential sender in the system". In
our terminology, i is beyond suspicion with respect to j if for each $i' \in I_A$,

$$\mathcal{I} \models \theta(i, a) \Rightarrow \Pr_j[\theta(i, a)] \le \Pr_j[\theta(i', a)].$$

This is clearly weaker than strong probabilistic anonymity, but still a very strong 
requirement, and perhaps more reasonable, too. Our main point is that a wide 
variety of properties can be expressed clearly and succinctly in our framework. 
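
To make the relationship between these conditions concrete, the following sketch
evaluates them against the observer's probabilities at a single point where the action
has been performed. This is a simplification: the definitions quantify over every such
point in the system. The function names, the additive tolerance used for approximate
equality, and the sample numbers are all assumptions made for illustration.

```python
from fractions import Fraction


def strongly_probabilistically_anonymous(belief, i, anonymity_set):
    """The observer's probability for i equals that of every agent in the set."""
    return all(belief[i] == belief[other] for other in anonymity_set)


def beyond_suspicion(belief, i, anonymity_set):
    """The observer's probability for i is no larger than for any agent in the set."""
    return all(belief[i] <= belief[other] for other in anonymity_set)


def approximately_equal(belief, i, anonymity_set, tolerance):
    """Weakened equality: probabilities differ from i's by at most `tolerance`."""
    return all(abs(belief[i] - belief[other]) <= tolerance
               for other in anonymity_set)


# Hypothetical beliefs of the observer about three potential senders.
belief = {"i": Fraction(1, 4), "u": Fraction(1, 4), "v": Fraction(1, 2)}
group = {"i", "u", "v"}

print(strongly_probabilistically_anonymous(belief, "i", group))  # False
print(beyond_suspicion(belief, "i", group))                      # True
print(beyond_suspicion(belief, "v", group))                      # False
```

With these numbers, i is beyond suspicion without being strongly probabilistically
anonymous, which matches the observation that the former condition is the weaker one.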

4.2 Conditional Anonymity 

While we have shown that many useful notions of anonymity — including many 
definitions that have already been proposed — can be expressed in our framework, 
we claim that there are some important intuitions that have not yet been captured. 
Suppose, for example, that someone makes a $5,000,000 donation to Cornell Uni- 
versity. It is clearly not the case that everyone is equally likely, or even almost 
equally likely, to have made the donation. Of course, we could take the anonymity 
set $I_A$ to consist of those people who might be in a position to make such a large
donation, and insist that they all be considered equally likely. Unfortunately, even
that is unreasonable: a priori, some of them may already have known connections
to Cornell, and thus be considered far more likely to have made the donation. All
that an anonymous donor can reasonably expect is that nothing an observer learns
from his interactions with the environment (e.g., reading the newspapers, noting 
when the donation was made, etc.) will give him more information about the iden- 
tity of the donor than he already had. 

For another example, consider a conference or research journal that provides 
anonymous reviews to researchers who submit their papers for publication. It is 
unlikely that the review process provides anything like $\alpha$-anonymity for a small $\alpha$,
or strong probabilistic anonymity up to some reasonable set. When the preliminary
version of this paper, for example, was accepted by the Computer Security
Foundations Workshop, the acceptance notice included three reviews that were, in 
our terminology, anonymous up to the program committee. That is, any one of the 
reviews we received could have been written by any of the members of the program 
committee. However, by reading some of the reviews, we were able to make fairly 
good guesses as to which committee members had provided which reviews, based 
on our knowledge of the specializations of the various members, and based on the 
content of the reviews themselves. Moreover, we had a fairly good idea of which 
committee members would provide reviews of our paper even before we received 
the reviews. Thus, it seems unreasonable to hope that the review process would 
provide strong probabilistic anonymity (up to the program committee), or even 
some weaker variant of probabilistic anonymity. Probabilistic anonymity would 
require the reviews to convert our prior beliefs, according to which some program 
committee members were more likely than others to be reviewers of our paper, to 
posterior beliefs according to which all program committee members were equally 
likely! This does not seem at all reasonable. However, the reviewers might hope
that the process did not give us any more information than we already had.

In our paper on secrecy [Halpern and O'Neill 2002], we tried to capture the
intuition that, when an unclassified user interacts with a secure system, she does 
not learn anything about any classified user that she didn't already know. We did 
this formally by requiring that, for any three points (r, m), (r', m'), and (r'', m''),

$$\mu_{r,m,j}(\mathcal{K}_i(r'', m'')) = \mu_{r',m',j}(\mathcal{K}_i(r'', m'')). \qquad (1)$$

That is, whatever the unclassified user j sees, her probability of any particular 
classified state will remain unchanged. 

When defining anonymity, we are not concerned with protecting all information
about some agent i, but rather the fact that i performs some particular action
a. Given a probabilistic system $\mathcal{I} = (\mathcal{R}, \mu, \pi)$ and a formula $\varphi$, let $e_r(\varphi)$ consist
of the set of runs r such that $\varphi$ is true at some point in r, and let $e_p(\varphi)$ be the set of
points where $\varphi$ is true. That is,

$$e_r(\varphi) = \{r : \exists m\, ((\mathcal{I}, r, m) \models \varphi)\},$$
$$e_p(\varphi) = \{(r, m) : (\mathcal{I}, r, m) \models \varphi\}.$$

The most obvious analogue to (1) is the requirement that, for all points (r, m) and
(r', m'),

$$\mu_{r,m,j}(e_p(\theta(i, a))) = \mu_{r',m',j}(e_p(\theta(i, a))).$$

This definition says that j never learns anything about the probability that i
performs a: she always ascribes the same probability to this event. In the
context of our anonymous donation example, this would say that the probability 
(according to j) of i donating $5,000,000 to Cornell is the same at all times. 

The problem with this definition is that it does not allow j to learn that some- 
one donated $5,000,000 to Cornell. That is, before j learned that someone donated 
$5,000,000 to Cornell, j may have thought it was unlikely that anyone would do- 
nate that much money to Cornell. We cannot expect that j's probability of i do- 
nating $5,000,000 would be the same both before and after learning that someone 
made a donation. We want to give a definition of conditional anonymity that allows 
observers to learn that an action has been performed, but that protects (as much as
possible, given the system) the fact that some particular agent performs
the action. If, on the other hand, the anonymous action has not been performed, 
then the observer's probabilities do not matter. 

Suppose that i wants to perform action a, and wants conditional anonymity
with respect to j. Let $\theta(\bar{j}, a)$ represent the fact that a has been performed by
some agent other than j, i.e., $\theta(\bar{j}, a) = \bigvee_{i' \neq j} \theta(i', a)$. The definition of conditional
anonymity says that j's prior probability of $\theta(i, a)$ given $\theta(\bar{j}, a)$ must be the
same as his posterior probability of $\theta(i, a)$ at points where j knows $\theta(\bar{j}, a)$, i.e., at
points where j knows that someone other than j has performed (or will perform)
a. Let $\alpha = \mu(e_r(\theta(i, a)) \mid e_r(\theta(\bar{j}, a)))$. This is the prior probability that i has
performed a, given that somebody other than j has. Conditional anonymity says that
at any point where j knows that someone other than j performs a, j's probability
of $\theta(i, a)$ must be $\alpha$. In other words, j shouldn't be able to learn anything more
about who performs a (except that somebody does) than he knew before he began
interacting with the system in the first place.

Definition 4.3: Action a, performed by agent i, is conditionally anonymous with
respect to j in the probabilistic system $\mathcal{I}$ if

$$\mathcal{I} \models K_j\theta(\bar{j}, a) \Rightarrow \Pr_j(\theta(i, a)) = \mu(e_r(\theta(i, a)) \mid e_r(\theta(\bar{j}, a))).$$

Note that if only one agent ever performs a, then a is trivially conditionally anony- 
mous with respect to j, but may not be minimally anonymous with respect to j. 
Thus, conditional anonymity does not necessarily imply minimal anonymity. 

In Definition 4.3, we implicitly assumed that agent j was allowed to learn that
someone other than j performed action a; anonymity is intended to hide which 
agent performed a, given that somebody did. More generally, we believe that we 
need to consider anonymity with respect to what an observer is allowed to learn. 
We might want to specify, for example, that an observer is allowed to know that 
a donation was made, and for how much, or to learn the contents of a conference 
paper review. The following definition lets us do this formally. 

Definition 4.4: Action a, performed by agent i, is conditionally anonymous with
respect to j and $\varphi$ in the probabilistic system $\mathcal{I}$ if

$$\mathcal{I} \models K_j\varphi \Rightarrow \Pr_j(\theta(i, a)) = \mu(e_r(\theta(i, a)) \mid e_r(\varphi)).$$

Definition 4.3 is clearly the special case of Definition 4.4 where $\varphi = \theta(\bar{j}, a)$.
Intuitively, both of these definitions say that once an observer learns some fact $\varphi$
connected to the fact $\theta(i, a)$, we require that she doesn't learn anything else that
might change her probabilities of $\theta(i, a)$.

4.3 Example: Probabilistic Dining Cryptographers 

Returning to the dining cryptographers problem, suppose that it is well-known that
one of the three cryptographers at the table is much more generous than the other
two, and therefore more likely to pay for dinner. Suppose, for example, that the
probability measure on the set of runs where the generous cryptographer has paid
is 0.8, given that one of the cryptographers paid for dinner, and that it is 0.1 for
each of the other two cryptographers. Conditional anonymity for each of the three
cryptographers with respect to an outside observer means that when such an observer
learns that one of the cryptographers has paid for dinner, his probabilities that each
of the three cryptographers paid should remain 0.8, 0.1, and 0.1, respectively. If one of the
thrifty cryptographers paid, the generous cryptographer should think that there is
a probability of 0.5 = 0.1/(0.1 + 0.1) that either of the others paid. Likewise,
if the generous cryptographer paid, each of the others should think that there is a
probability of 0.8/(0.8 + 0.1) that the generous cryptographer paid and a probability
of 0.1/(0.8 + 0.1) that the other thrifty cryptographer paid. We can similarly
calculate all the other relevant probabilities.

More generally, suppose that we have an interpreted probabilistic system $(\mathcal{R}, \mu, \pi)$
that represents instances of the dining cryptographers protocol, where the
interpretation $\pi$ once again interprets formulas of the form $\theta(i, \text{"paid"})$ and $\theta(\bar{j}, \text{"paid"})$
in the obvious way, and where the formula $\gamma$ is true if one of the cryptographers
paid. (That is, $\gamma$ is equivalent to $\bigvee_{i \in \{0,1,2\}} \theta(i, \text{"paid"})$.) For any cryptographer
$i \in \{0, 1, 2\}$, let $\alpha(i)$ be the prior probability that i paid, given that one of the
cryptographers did. That is, let

$$\alpha(i) = \mu(e_r(\theta(i, \text{"paid"})) \mid e_r(\gamma)).$$

In the more concrete example given above, if 0 is the generous cryptographer, we
would have $\alpha(0) = 0.8$ and $\alpha(1) = \alpha(2) = 0.1$.

For the purposes of conditional probability with respect to an agent j, we are
interested in the probability that some agent i paid, given that somebody other than
j paid. Formally, for $i \neq j$, let

$$\alpha(i, j) = \mu(e_r(\theta(i, \text{"paid"})) \mid e_r(\theta(\bar{j}, \text{"paid"}))).$$

If an observer o is not one of the three cryptographers, then o didn't pay, and we
have $\alpha(i, o) = \alpha(i)$. Otherwise, if $i, j \in \{0, 1, 2\}$, we can use conditioning to
compute $\alpha(i, j)$:

$$\alpha(i, j) = \frac{\alpha(i)}{\alpha(j \oplus 1) + \alpha(j \oplus 2)}.$$

(Once again, we make our definitions and requirements more compact by using
modular arithmetic, where $\oplus$ denotes addition mod 3.)
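
The conditioning step can be checked numerically against the concrete example with
one generous and two thrifty cryptographers. The sketch below is illustrative only: the
function name `alpha`, the dictionary `prior`, and the use of the string "o" for the
outside observer are hypothetical choices, and cryptographer 0 is assumed to be the
generous one.

```python
from fractions import Fraction

# Prior probability that each cryptographer paid, given that one of them did.
prior = {0: Fraction(8, 10), 1: Fraction(1, 10), 2: Fraction(1, 10)}


def alpha(i, j, prior):
    """Prior probability that i paid, given that someone other than j paid."""
    if j == "o":          # an outside observer did not pay, so nothing changes
        return prior[i]
    if i == j:
        raise ValueError("alpha(i, j) is defined only for i != j")
    return prior[i] / (prior[(j + 1) % 3] + prior[(j + 2) % 3])


print(alpha(1, 0, prior))    # 1/2
print(alpha(0, 1, prior))    # 8/9
print(alpha(2, 1, prior))    # 1/9
print(alpha(0, "o", prior))  # 4/5
```

These values agree with the probabilities 0.1/(0.1 + 0.1), 0.8/(0.8 + 0.1), and
0.1/(0.8 + 0.1) computed by hand for the generous-cryptographer example.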

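The following formula captures the requirement of conditional anonymity in
the dining cryptographers protocol, for each cryptographer i, with respect to the
other cryptographers and any outside observers.

$$\begin{aligned}
\mathcal{I} \models{} & [K_{i \oplus 1}\theta(\overline{i \oplus 1}, \text{"paid"}) \Rightarrow \Pr_{i \oplus 1}(\theta(i, \text{"paid"})) = \alpha(i, i \oplus 1)] \wedge {} \\
& [K_{i \oplus 2}\theta(\overline{i \oplus 2}, \text{"paid"}) \Rightarrow \Pr_{i \oplus 2}(\theta(i, \text{"paid"})) = \alpha(i, i \oplus 2)] \wedge {} \\
& [K_o\theta(\bar{o}, \text{"paid"}) \Rightarrow \Pr_o(\theta(i, \text{"paid"})) = \alpha(i, o)].
\end{aligned}$$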

Chaum's original proof that the dining cryptographers protocol provides anonymity 
actually proves conditional anonymity in this general setting. Note that if the prob- 
ability that one of the cryptographers will pay is 1, that cryptographer will have 
conditional anonymity even though he doesn't even have minimal anonymity. 

4.4 Other Uses for Probability 

In the previous two subsections, we have emphasized how probability can be used
to obtain definitions of anonymity stronger than those presented in Section 3.
However, probabilistic systems can also be used to define interesting ways of weakening
those definitions. Real-world anonymity systems do not offer absolute guarantees
of anonymity such as those specified by our definitions. Rather, they guarantee
that a user's anonymity will be protected with high probability. In a given run,
a user's anonymity might be protected or corrupted. If the probability of the event
that a user's anonymity is corrupted is very small, i.e., the set of runs where her
anonymity is not protected is assigned a very small probability by the measure $\mu$,
this might be enough of a guarantee for the user to interact with the system.

Recall that we said that i maintains total anonymity with respect to j if the
fact $\varphi = \theta(i, a) \Rightarrow \bigwedge_{i' \neq j} P_j[\theta(i', a)]$ is true at every point in the system. Total
anonymity is compromised in a run r if at some point (r, m), $\neg\varphi$ holds. Therefore,
the set of runs where total anonymity is compromised is simply $e_r(\neg\varphi)$, using the
notation of the previous section. If $\mu(e_r(\neg\varphi))$ is very small, then i maintains total
anonymity with very high probability. This analysis can obviously be extended to
all the other definitions of anonymity given in previous sections.
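
For a finite system, this bound is simply the total weight of the runs that contain
at least one point where the anonymity formula fails. The sketch below assumes that the
truth value of the formula at each point is already available as a predicate
`phi_holds`; that predicate, the sample measure, and the set of "leaking" points are
hypothetical.

```python
from fractions import Fraction


def compromise_probability(mu, points, phi_holds):
    """Total probability of the runs containing a point where phi fails."""
    bad_runs = {r for (r, m) in points if not phi_holds(r, m)}
    return sum((mu[r] for r in bad_runs), Fraction(0))


# Hypothetical system: three runs of three steps each; anonymity fails
# at the points listed in `leaks`.
mu = {"r1": Fraction(97, 100), "r2": Fraction(2, 100), "r3": Fraction(1, 100)}
points = [(r, m) for r in mu for m in range(3)]
leaks = {("r2", 2), ("r3", 1), ("r3", 2)}


def phi_holds(r, m):
    return (r, m) not in leaks


print(compromise_probability(mu, points, phi_holds))  # 3/100
```

A run is counted once no matter how many of its points violate the formula, which is
why the measure is taken over runs rather than over points.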

Bounds such as these are useful for analyzing real-world systems. The Crowds
system [Reiter and Rubin 1998], for example, uses randomization when routing
communication traffic, so that anonymity is protected with high probability. The
probabilistic guarantees provided by Crowds were analyzed formally by Shmatikov
[2002], using a probabilistic model checker, and he demonstrates how the anonymity
guarantees provided by the Crowds system change as more users (who may be
either honest or corrupt) are added to the system. Shmatikov uses a temporal
probabilistic logic to express probabilistic anonymity properties, so these properties can
be expressed in our system framework. (It is straightforward to give semantics to
temporal operators in systems; see [Fagin, Halpern, Moses, and Vardi 1995].) In
any case, Shmatikov's analysis of a real-world anonymity system is a useful
example of how the formal methods that we advocate can be used to specify and verify
properties of real-world systems.

5 Related Work 

5.1 Knowledge-based Definitions of Anonymity 

As mentioned in the introduction, we are not the first to use knowledge to
handle definitions of security, information hiding, or even anonymity. Anonymity has
been formalized using epistemic logic by Syverson and Stubblebine [1999]. Like
us, they use epistemic logic to characterize a number of information-hiding
requirements that involve anonymity. However, the focus of their work is very different
from ours. They describe a logic for reasoning about anonymity and a number of
axioms for the logic. An agent's knowledge is based, roughly speaking, on his
recent actions and observations, as well as what follows from his log of system
events. The first five axioms that Syverson and Stubblebine give are the standard
S5 axioms for knowledge. There are well-known soundness and completeness
results relating the S5 axiom system to Kripke structure semantics for knowledge
[Fagin, Halpern, Moses, and Vardi 1995]. However, they give many more axioms,
and they do not attempt to give a semantics for which their axioms are sound. Our
focus, on the other hand, is completely semantic. We have not tried to axiomatize
anonymity. Rather, we try to give an appropriate semantic framework in which to
consider anonymity.

In some ways, Syverson and Stubblebine's model is more detailed than the 
model used here. Their logic includes many formulas that represent various actions 
and facts, including the sending and receiving of messages, details of encryption 
and keys, and so on. They also make more assumptions about the local state of 
a given agent, including details about the sequence of actions that the agent has 
performed locally, a log of system events that have been recorded, and a set of 
facts of which the agent is aware. While these extra details may accurately reflect
the nature of agents in real-world systems, they are orthogonal to our concerns 
here. In any case, it would be easy to add such expressiveness to our model as 
well, simply by including these details in the local states of the various agents. 

It is straightforward to relate our definitions to those of Syverson and
Stubblebine. They consider facts of the form $\varphi(i)$, where i is a principal, i.e., an agent.
They assume that the fact $\varphi(i)$ is a single formula in which a single agent name
occurs. Clearly, $\theta(i, a)$ is an example of such a formula. In fact, Syverson and
Stubblebine assume that if $\varphi(i)$ and $\varphi(j)$ are both true, then i = j. For the $\theta(i, a)$
formulas, this means that $\theta(i, a)$ and $\theta(i', a)$ cannot be simultaneously true: at
most one agent can perform an action in a given run, exactly as in the setup of
Proposition 

There is one definition in [Syverson and Stubblebine 1999] that is especially
relevant to our discussion; the other relevant definitions presented there are similar.
A system is said to satisfy $(\ge k)$-anonymity if the following formula is valid for
some observer o:

$$\varphi(i) \Rightarrow P_o(\varphi(i)) \wedge P_o(\varphi(i_1)) \wedge \cdots \wedge P_o(\varphi(i_{k-1})).$$

This definition says that if $\varphi(i)$ holds, there must be at least k agents, including i,
that the observer suspects. (The existential quantification of the agents $i_1, \ldots, i_{k-1}$
is implicit.) The definition is essentially equivalent to our definition of
$(k-1)$-anonymity. It certainly implies that there are $k-1$ agents other than i for which
$\varphi(i')$ might be true. On the other hand, if $P_o(\varphi(i'))$ is true for $k-1$ agents other
than i, then the formula must hold, because $\varphi(i) \Rightarrow P_o(\varphi(i))$ is valid.

5.2 CSP and Anonymity 

A great deal of work on the foundations of computer security has used process
algebras such as CCS and CSP [Milner 1980; Hoare 1985] as the basic system
framework [Focardi and Gorrieri 2001; Schneider 1996]. Process algebras offer
several advantages: they are simple, they can be used for specifying systems as
well as system properties, and model-checkers are available that can be used to
verify properties of systems described using their formalisms.

Schneider and Sidiropoulos [1996] use CSP both to characterize one type of
anonymity and to describe variants of the dining cryptographers problem [Chaum 1988].
They then use a model-checker to verify that their notion of anonymity holds for
those variants of the problem. To describe their approach, we need to outline some
of the basic notation and semantics of CSP. To save space, we give a simplified
treatment of CSP here. (See Hoare [1985] for a complete description of CSP.) The
basic unit of CSP is the event. Systems are modeled in terms of the events that
they can perform. Events may be built up from several components. For example,
"donate.$5" might represent a "donate" event in the amount of $5. Processes are the
systems, or components of systems, that are described using CSP. As a process
unfolds or executes, various events occur. For our purposes, we make the simplifying
assumption that a process is determined by the event sequences it is able to engage
in.

We can associate with every process a set of traces. Intuitively, each trace in 
the set associated with process P represents one sequence of events that might 
occur during an execution of P. Informally, CSP event traces correspond to finite 
prefixes of runs, except that they do not explicitly describe the local states of agents 
and do not explicitly describe time. 

Schneider and Sidiropoulos define a notion of anonymity with respect to a set 
$A$ of events. Typically, $A$ consists of events of the form $i.a$ for a fixed action $a$, 
where $i$ is an agent in some set that we denote $I_A$. Intuitively, anonymity with respect 
to $A$ means that if any event in $A$ occurs, it could equally well have been any other 
event in $A$. In particular, this means that if an agent in $I_A$ performs $a$, it could 
equally well have been any other agent in $I_A$. Formally, given a set $\Sigma$ of possible 
events and $A \subseteq \Sigma$, let $f_A$ be a function on traces that, given a trace $\tau$, returns a 
trace $f_A(\tau)$ that is identical to $\tau$ except that every event in $A$ is replaced by a fixed 
event $\alpha \notin \Sigma$. A process $P$ is strongly anonymous on $A$ if $f_A^{-1}(f_A(P)) = P$, where 
we identify $P$ with its associated set of traces. This means that all the events in $A$ 
are interchangeable; by replacing any event in $A$ with any other we would still get 
a valid trace of $P$. 

Schneider and Sidiropoulos give several very simple examples that are useful 
for clarifying this definition of anonymity. One is a system where there are two 
agents who can provide donations to a charity, but where only one of them will ac- 
tually do so. Agent 0, if she gives a donation, gives $5, and agent 1 gives $10. This 
is followed by a "thanks" from the charity. The events of interest are "0.gives" and 
"1.gives" (representing events where 0 and 1 make a donation), "$5" and "$10" 
(representing the charity's receipt of the donation), "thanks", and "STOP" (to signify 
that the process has ended). There are two possible traces: 

1. 0.gives → $5 → "thanks" → STOP. 

2. 1.gives → $10 → "thanks" → STOP. 

The donors require anonymity, and so we require that the CSP process is strongly 
anonymous on the set {0.gives, 1.gives}. In fact, this condition is not satisfied 
by the process, because "0.gives" and "1.gives" are not interchangeable. This is 
because "0.gives" must be followed by "$5", while "1.gives" must be followed by 
"$10". Intuitively, an agent who observes the traces can determine the donor by 
looking at the amount of money donated. 
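
The failure of strong anonymity in this example can be checked mechanically. The 
following Python sketch is only an illustration, not the model-checking procedure 
of Schneider and Sidiropoulos. It assumes that a process is given as a finite set of 
complete traces (prefixes are omitted), and the names `f_A`, `preimage`, 
`strongly_anonymous`, and the placeholder event `ALPHA` are hypothetical. 

```python
from itertools import product

ALPHA = "alpha"  # assumed name for the fixed fresh event that replaces events in A


def f_A(trace, A):
    """Rename every event of A occurring in the trace to ALPHA."""
    return tuple(ALPHA if event in A else event for event in trace)


def preimage(trace, A):
    """Return all traces that f_A maps to the same renamed trace as `trace`."""
    # An event in A may be swapped for any event in A; other events are fixed.
    options = [sorted(A) if event in A else [event] for event in trace]
    return set(product(*options))


def strongly_anonymous(P, A):
    """Check whether f_A^{-1}(f_A(P)) == P for a finite set P of traces."""
    closure = set()
    for trace in P:
        closure |= preimage(trace, A)
    return closure == P


A = {"0.gives", "1.gives"}
donations = {
    ("0.gives", "$5", "thanks", "STOP"),
    ("1.gives", "$10", "thanks", "STOP"),
}

print(strongly_anonymous(donations, A))  # False: the amount reveals the donor
```

The check fails because swapping "0.gives" for "1.gives" in the first trace yields 
a trace in which agent 1 gives $5, which is not a trace of the process. 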

We believe that Schneider and Sidiropoulos's definition is best understood as 
trying to capture the intuition that an observer who sees all the events generated 
by P, except for events in A, does not know which event in A occurred. We can 
make this precise by translating Schneider and Sidiropoulos's definition into our 
framework. The first step is to associate with each process $P$ a corresponding set 
of runs $\mathcal{R}_P$. We present one reasonable way of doing so here, which suffices for 
our purposes. In future work, we hope to explore the connection between CSP and 
the runs and systems framework in more detail. 

Recall that a run is an infinite sequence of global states of the form $(s_e, s_1, \ldots, s_n)$, 
where each $s_i$ is the local state of agent $i$, and $s_e$ is the state of the environment. 
Therefore, to specify a set of runs, we need to describe the set of agents, and then 
explain how to derive the local states of each agent for each run. There is an obvi- 
ous problem here: CSP has no analogue of agents and local states. To get around 
this, we could simply tag all events with an agent (as Schneider and Sidiropoulos 
in fact do for the events in A). However, for our current purposes, a much simpler 
approach will do. The only agent we care about is a (possibly mythical) observer 
who is able to observe every event except the ones in A. Moreover, for events in 
A, the observer knows that something happened (although not what). There may 
be other agents in the system, but their local states are irrelevant. We formalize this 
as follows. 

Fix a process $P$ over some set $\Sigma$ of events, and let $A \subseteq \Sigma$. Following Schneider 
and Sidiropoulos, for the purposes of this discussion, assume that $A$ consists of 
events of the form $i.a$, where $i \in I_A$ and $a$ is some specific action. We say that a 
system $\mathcal{R}$ is compatible with $P$ if there exists some agent $o$ such that the following 
two conditions hold: 

• for every run $r \in \mathcal{R}$ and every time $m$, there exists a trace $\tau \in P$ such that 
  $\tau = r_e(m)$ and $f_A(\tau) = r_o(m)$; 

• for every trace $\tau \in P$, there exists a run $r \in \mathcal{R}$ such that $r_e(|\tau|) = \tau$ and 
  $r_o(|\tau|) = f_A(\tau)$ (where $|\tau|$ is the number of events in $\tau$). 

Intuitively, $\mathcal{R}$ represents $P$ if (1) for every trace $\tau$ in $P$, there is a point $(r, m)$ in 
$\mathcal{R}$ such that, at this point, exactly the events in $\tau$ have occurred (and are recorded in 
the environment's state) and $o$ has observed $f_A(\tau)$, and (2) for every point $(r, m)$ 
in $\mathcal{R}$, there is a trace $\tau$ in $P$ such that precisely the events in $r_e(m)$ have happened 
in $\tau$, and $o$ has observed $f_A(\tau)$ at $(r, m)$. We say that the interpreted system 
$\mathcal{I} = (\mathcal{R}, \pi)$ is compatible with $P$ if $\mathcal{R}$ is compatible with $P$ and if 
$(\mathcal{I}, r, m) \models \theta(i, a)$ whenever the event $i.a$ is in the event sequence $r_e(m')$ 
for some $m'$. 

We are now able to make a formal connection between our definition of anonymity 
and that of Schneider and Sidiropoulos. As in the setup of Proposition we as- 
sume that an anonymous action a can be performed only once in a given run. 

Theorem 5.1: If $\mathcal{I} = (\mathcal{R}, \pi)$ is compatible with $P$, then $P$ is strongly anonymous 
on the alphabet $A$ if and only if for every agent $i \in I_A$, the action $a$ performed by 
$i$ is anonymous up to $I_A$ with respect to $o$ in $\mathcal{I}$. 

Proof: Suppose that $P$ is strongly anonymous on the alphabet $A$ and that $i \in I_A$. 
Given a point $(r, m)$, suppose that $(\mathcal{I}, r, m) \models \theta(i, a)$, so that the event $i.a$ appears 
in $r_e(n)$ for some $n \geq m$. We must show that $(\mathcal{I}, r, m) \models P_o[\theta(i', a)]$ for every 
$i' \in I_A$, that is, that $a$ is anonymous up to $I_A$ with respect to $o$. For any $i' \in I_A$, 
this requires showing that there exists a point $(r', m')$ such that $r_o(m) = r'_o(m')$, 
and $r'_e(n')$ includes $i'.a$, for some $n' \geq m'$. Because $\mathcal{R}$ is compatible with $P$, 
there exists $\tau \in P$ such that $\tau = r_e(n)$ and $i.a$ appears in $\tau$. Let $\tau'$ be the trace 
identical to $\tau$ except that $i.a$ is replaced by $i'.a$. Because $P$ is strongly anonymous 
on $A$, $P = f_A^{-1}(f_A(P))$, and $\tau' \in P$. By compatibility, there exists a run $r'$ 
such that $r'_e(n) = \tau'$ and $r'_o(n) = f_A(\tau')$. By construction, $f_A(\tau) = f_A(\tau')$, so 
$r_o(n) = r'_o(n)$. Because the length-$m$ trace prefixes of $f_A(\tau)$ and $f_A(\tau')$ are the 
same, it follows that $r_o(m) = r'_o(m)$. Because $(\mathcal{I}, r', m) \models \theta(i', a)$, 
$(\mathcal{I}, r, m) \models P_o[\theta(i', a)]$ as required. 

Conversely, suppose that for every agent $i \in I_A$, the action $a$ performed by $i$ 
is anonymous up to $I_A$ with respect to $o$ in $\mathcal{I}$. We must show that $P$ is strongly 
anonymous. It is clear that $P \subseteq f_A^{-1}(f_A(P))$, so we must show only that 
$P \supseteq f_A^{-1}(f_A(P))$. So suppose that $\tau \in f_A^{-1}(f_A(P))$. If no event $i.a$ appears in $\tau$, for 
any $i \in I_A$, then $\tau \in P$ trivially. Otherwise, some $i.a$ does appear. Because 
$\tau \in f_A^{-1}(f_A(P))$, there exists a trace $\tau' \in P$ that is identical to $\tau$ except that $i'.a$ 
replaces $i.a$, for some other $i' \in I_A$. Because $\mathcal{R}$ is compatible with $P$, there exists 
a run $r' \in \mathcal{R}$ such that $r'_o(m) = f_A(\tau')$ and $r'_e(m) = \tau'$ (where $m = |\tau'|$). Clearly 
$(\mathcal{I}, r', m) \models \theta(i', a)$ so, by anonymity, $(\mathcal{I}, r', m) \models P_o[\theta(i, a)]$, and there exists a 
run $r$ such that $r_o(m) = r'_o(m)$ and $(\mathcal{I}, r, m) \models \theta(i, a)$. Because the action $a$ can 
be performed at most once, the trace $r_e(m)$ must be equal to $\tau$. By compatibility, 
$\tau \in P$ as required. ∎ 
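
To see the knowledge-based side of Theorem 5.1 on a small case, the following 
Python sketch builds a finite system from a set of complete traces and checks 
anonymity up to $I_A$ directly. It is an illustration under simplifying 
assumptions, not part of the proof: runs are truncated to the length of their 
trace, there is one run per complete trace, and the helper names (`points`, 
`performs`, `anonymous_up_to`, `ALPHA`) are hypothetical. 

```python
ALPHA = "alpha"  # assumed name for the fresh event that hides which A-event occurred


def f_A(trace, A):
    """Rename every event of A occurring in the trace to ALPHA."""
    return tuple(ALPHA if e in A else e for e in trace)


def points(traces, A):
    """One finite run per complete trace; yield (run, m, r_e(m), r_o(m))."""
    for run in traces:
        for m in range(len(run) + 1):
            prefix = run[:m]                      # environment state: events so far
            yield run, m, prefix, f_A(prefix, A)  # observer state: f_A of that prefix


def performs(run, agent, action):
    """theta(agent, action): the event agent.action occurs somewhere in the run."""
    return f"{agent}.{action}" in run


def anonymous_up_to(traces, agents, action):
    """Whenever theta(i, a) holds at a point, the observer must consider
    theta(i', a) possible for every i' in `agents`."""
    A = {f"{i}.{action}" for i in agents}
    pts = list(points(traces, A))
    for run, _, _, obs in pts:
        for i in agents:
            if not performs(run, i, action):
                continue
            for other in agents:
                possible = any(
                    obs2 == obs and performs(run2, other, action)
                    for run2, _, _, obs2 in pts
                )
                if not possible:
                    return False
    return True


donations = {
    ("0.gives", "$5", "thanks", "STOP"),
    ("1.gives", "$10", "thanks", "STOP"),
}
same_amount = {
    ("0.gives", "$5", "thanks", "STOP"),
    ("1.gives", "$5", "thanks", "STOP"),
}

print(anonymous_up_to(donations, ["0", "1"], "gives"))    # False
print(anonymous_up_to(same_amount, ["0", "1"], "gives"))  # True
```

For the donation process the observer's state after the second event contains 
either "$5" or "$10", so the check fails, in agreement with the failure of strong 
anonymity noted earlier for this process. 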

Up to now, we have assumed that the observer o has access to all the infor- 
mation in the system except which event in A was performed. Schneider and 
Sidiropoulos extend their definition of strong anonymity to deal with agents that 
have somewhat less information. They capture "less information" using abstraction 
operators. Given a process $P$, there are several abstraction operators that can 
give us a new process. For example, the hiding operator, represented by $\setminus$, hides 
all events in some set $C$. That is, the process $P \setminus C$ is the same as $P$ except that 
all events in $C$ become internal events of the new process, and are not included in 
the traces associated with $P \setminus C$. Another abstraction operator, the renaming 
operator, has already appeared in the definition of strong anonymity: for any set $C$ of 
events, we can consider the function $f_C$ that maps events in $C$ to a fixed new event. 
The difference between hiding and renaming is that, if events in C are hidden, the 
observer is not even aware they took place. If events in C are renamed, then the 
observer is aware that some event in C took place, but does not know which one. 

Abstraction operators such as these provide a useful way to model a process 
or agent who has a distorted or limited view of the system. In the context of 
anonymity, they allow anonymity to hold with respect to an observer with a limited 
view of the system in cases where it would not hold with respect to an observer who 
can see everything. In the anonymous donations example, hiding the events "$5" and 
"$10", i.e., the amount of money donated, would make the new process $P \setminus \{\$5, \$10\}$ 
strongly anonymous on the set of donation events. Formally, given an abstraction 
operator $ABS_C$ on a set of events $C$, we have to check the requirement of strong 
anonymity on the process $ABS_C(P)$ rather than on the process $P$. 
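
The effect of hiding on the donations example can be sketched in a few lines of 
Python. This is an illustrative finite-trace check only; the function names 
`hide` and `strongly_anonymous` are hypothetical, and the process is assumed to 
be given by its two complete traces. 

```python
from itertools import product


def hide(trace, C):
    """Hiding operator on a single trace: drop every event that belongs to C."""
    return tuple(e for e in trace if e not in C)


def strongly_anonymous(P, A):
    """Finite-trace check of f_A^{-1}(f_A(P)) == P."""
    closure = set()
    for trace in P:
        # Any event in A may be replaced by any other event in A.
        options = [sorted(A) if e in A else [e] for e in trace]
        closure |= set(product(*options))
    return closure == P


A = {"0.gives", "1.gives"}
P = {
    ("0.gives", "$5", "thanks", "STOP"),
    ("1.gives", "$10", "thanks", "STOP"),
}
P_hidden = {hide(trace, {"$5", "$10"}) for trace in P}

print(strongly_anonymous(P, A))         # False
print(strongly_anonymous(P_hidden, A))  # True
```

Once the amounts are hidden, the two traces differ only in the donation event, so 
swapping one donor for the other always yields another trace of the process. 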

Abstraction is easily captured in our framework. It amounts simply to changing 
the local state of the observer. For example, anonymity of the process $P \setminus C$ in our 
framework corresponds to anonymity of the action $a$ for every agent in $I_A$ with 
respect to an observer whose local state at the point $(r, m)$ is $f_A(r_e(m)) \setminus C$. We 
omit the obvious analogue of Theorem 5.1 here. 

A major advantage of the runs and systems framework is that definitions of 
high-level properties such as anonymity do not depend on the local states of the 
agents in question. If we want to model the fact that an observer has a limited 
view of the system, we need only modify her local state to reflect this fact. While 
some limited views are naturally captured by CSP abstraction operators, others 
may not be. The definition of anonymity should not depend on the existence of 
an appropriate abstraction operator able to capture the limitations of a particular 
observer. 

As we have demonstrated, our approach to anonymity is compatible with the 
approach taken in [Schneider and Sidiropoulos 1996]. Our definitions are stated in 
terms of actions, agents, and knowledge, and are thus very intuitive and flexible. 
The generality of runs and systems allows us to have simple definitions that apply 
to a wide variety of systems and agents. The low-level CSP definitions, on the other 
hand, are more operational than ours, and this allows easier model-checking and 
verification. Furthermore, there are many advantages to using process algebras in 
general: systems can often be represented much more succinctly, and so on. This 
suggests that both approaches have their advantages. Because CSP systems can be 
represented in the runs and systems framework, however, it makes perfect sense 
to define anonymity for CSP processes using the knowledge-based definitions we 
have presented here. If our definitions turn out to be equivalent to more low-level 
CSP definitions, this is ideal, because CSP model-checking programs can then be 
used for verification. A system designer simply needs to take care that the runs- 
based system derived from a CSP process (or set of processes) represents the local 
states of the different agents appropriately. 



5.3 Anonymity and Function View Semantics 

Hughes and Shmatikov [2004] introduce function views and function-view opaqueness 
as a way of expressing a variety of information-hiding properties in a succinct 
and uniform way. Their main insight is that requirements such as anonymity in- 
volve restrictions on relationships between entities such as agents and actions. Be- 
cause these relationships can be expressed by functions from one set of entities 
to another, hiding information from an observer amounts to limiting an observer's 
view of the function in question. For example, anonymity properties are concerned 
with whether or not an observer is able to connect actions with the agents who 
performed them. By considering the function from the set of actions to the set of 
agents who performed those actions, and specifying the degree to which that func- 
tion must be opaque to observers, we can express anonymity using the function- 
view approach. 

To model the uncertainty associated with a given function, Hughes and Shmatikov 
define a notion of function knowledge to explicitly represent an observer's partial 
knowledge of a function. Function knowledge focuses on three particular aspects 
of a function: its graph, image, and kernel. (Recall that the kernel of a function $f$ 
with domain $X$ is the equivalence relation $\ker f$ on $X$ defined by $(x, x') \in \ker f$ iff 
$f(x) = f(x')$.) Function knowledge of type $X \to Y$ is a triple $N = (F, I, K)$, 
where $F \subseteq X \times Y$, $I \subseteq Y$, and $K$ is an equivalence relation on $X$. A triple 
$(F, I, K)$ is consistent with $f$ if $f \subseteq F$, $I \subseteq \mathrm{im} f$, and $K \subseteq \ker f$. Intuitively, 
a triple $(F, I, K)$ that is consistent with $f$ represents what an agent might know 
about the function $f$. Complete knowledge of a function $f$, for example, would be 
represented by the triple $(f, \mathrm{im} f, \ker f)$. 

For anonymity, and for information hiding in general, we are interested not in 
what an agent knows, but in what an agent does not know. This is formalized by 
Hughes and Shmatikov in terms of opaqueness conditions for function knowledge. 
If $N = (F, I, K)$ is consistent with $f : X \to Y$, then, for example, $N$ is $k$-value 
opaque if $|F(x)| \geq k$ for all $x \in X$, where $F(x) = \{y : (x, y) \in F\}$. That is, $N$ is 
$k$-value opaque if there are at least $k$ possible candidates for the value of $f(x)$, 
for all $x \in X$. Similarly, $N$ is $Z$-value opaque if $Z \subseteq F(x)$ for all $x \in X$. In other 
words, for each $x$ in the domain of $f$, no element of $Z$ can be ruled out as a 
candidate for $f(x)$. Finally, $N$ is absolutely value opaque if $N$ is $Y$-value opaque. 
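
These opaqueness conditions are simple set computations on the first component 
$F$ of a function-knowledge triple. The following Python sketch illustrates them, 
reading $k$-value opaqueness as "at least $k$ candidates". The function names and 
the example sets are hypothetical and are not taken from Hughes and Shmatikov. 

```python
def candidates(F, x):
    """F(x): the values the observer still considers possible for f(x)."""
    return {y for (x2, y) in F if x2 == x}


def k_value_opaque(F, X, k):
    """Every x in X has at least k candidate values."""
    return all(len(candidates(F, x)) >= k for x in X)


def Z_value_opaque(F, X, Z):
    """No element of Z can be ruled out as the value of f(x), for any x in X."""
    return all(set(Z) <= candidates(F, x) for x in X)


def absolutely_value_opaque(F, X, Y):
    """Y-value opaqueness: nothing in the codomain can be ruled out."""
    return Z_value_opaque(F, X, Y)


X = {"send"}                               # a single action
Y = {"alice", "bob", "carol"}              # the possible agents
F = {("send", "alice"), ("send", "bob")}   # the observer has ruled out carol

print(k_value_opaque(F, X, 2))                 # True
print(k_value_opaque(F, X, 3))                 # False
print(Z_value_opaque(F, X, {"alice", "bob"}))  # True
print(absolutely_value_opaque(F, X, Y))        # False
```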

Opaqueness conditions are closely related to the nonprobabilistic definitions 
of anonymity given in Section 3. Consider functions from $X$ to $Y$, where $X$ is 
a set of actions and $Y$ is a set of agents, and suppose that some function $f$ is the 
function that, given some action, names the agent who performed the action. If 
we have $k$-value opaqueness for some view of $f$ (corresponding to some observer 
$o$), this means, essentially, that each action $a$ in $X$ is $k$-anonymous with respect 
to $o$. Similarly, the view is $I_A$-value opaque if the action is anonymous up to $I_A$ 
for each agent $i \in I_A$. Thus, function view opaqueness provides a concise way of 
describing anonymity properties, and information-hiding properties in general. 

To make these connections precise, we need to explain how function views 
can be embedded within the runs and systems framework. Hughes and Shmatikov 
already show how we can define function views using Kripke structures, the stan- 
dard approach for giving semantics to knowledge. A minor modification of their 
approach works in systems too. Assume we are interested in who performs an 
action $a \in X$, where $X$, intuitively, is a set of "anonymous actions". Let $Y$ be the 
set of agents, including a "nobody agent" denoted $N$, and let $f$ be a function from 
$X$ to $Y$. Intuitively, $f(a) = i$ if agent $i$ performs action $a$, and $f(a) = N$ if no 
agent performs action $a$. The value of the function $f$ will depend on the point. Let 
$f_{(r,m)}$ be the value of $f$ at the point $(r, m)$. Thus, $f_{(r,m)}(a) = i$ if $i$ performs $a$ in run 
$r$.¹ We can now easily talk about function opaqueness with respect to an observer 
$o$. For example, $f$ is $Z$-value opaque at the point $(r, m)$ with respect to $o$ if, for all 
$x \in X$ and all $z \in Z$, there exists a point $(r', m')$ such that $r'_o(m') = r_o(m)$ and 
$f_{(r',m')}(x) = z$. In terms of knowledge, $Z$-value opaqueness says that for any value $x$ 
in the domain of $f$, $o$ thinks it possible that any value $z \in Z$ could be the result of 
$f(x)$. Indeed, Hughes and Shmatikov say that function-view opaqueness, defined in 
terms of Kripke structure semantics, is closely related to epistemic logic. The following 
proposition makes this precise; it would be easy to state similar propositions for 
other kinds of function-view opaqueness. 

¹ Note that for $f_{(r,m)}$ to be well-defined, it must be the case that only one agent can ever 
perform a single action. 

Proposition 5.2: Let $\mathcal{I} = (\mathcal{R}, \pi)$ be an interpreted system that satisfies 
$(\mathcal{I}, r, m) \models f(x) = y$ whenever $f_{(r,m)}(x) = y$. In system $\mathcal{I}$, $f$ is $Z$-value opaque 
for observer $o$ at the point $(r, m)$ if and only if 

$$(\mathcal{I}, r, m) \models \bigwedge_{x \in X} \bigwedge_{z \in Z} P_o[f(x) = z].$$

Proof: This result follows immediately from the definitions. ∎ 
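
The right-hand side of Proposition 5.2 can be evaluated directly on a small finite 
model. In the following Python sketch, which is only an illustration, a system is 
flattened into a list of points, each carrying the observer's local state and the 
value of $f$ at that point. The representation, the state labels, and the names 
`possible` and `Z_value_opaque_at` are assumptions made for the example. 

```python
# Each point is (observer_state, f), where f maps actions to agents at that point.
points = [
    ("saw-one-message", {"a": "alice"}),
    ("saw-one-message", {"a": "bob"}),
    ("saw-nothing", {"a": "N"}),  # "N" stands for the nobody agent
]


def possible(points, observer_state, x, z):
    """P_o[f(x) = z]: some point with the same observer state has f(x) = z."""
    return any(obs == observer_state and f[x] == z for obs, f in points)


def Z_value_opaque_at(points, point, X, Z):
    """Conjunction over x in X and z in Z of P_o[f(x) = z] at the given point."""
    observer_state, _ = point
    return all(possible(points, observer_state, x, z) for x in X for z in Z)


print(Z_value_opaque_at(points, points[0], {"a"}, {"alice", "bob"}))           # True
print(Z_value_opaque_at(points, points[0], {"a"}, {"alice", "bob", "carol"}))  # False
print(Z_value_opaque_at(points, points[2], {"a"}, {"alice"}))                  # False
```

The last line shows why opaqueness is later required only at points where the 
action has actually been performed: when nobody acts, the observer in this toy 
model does not suspect anyone. 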

Stated in terms of knowledge, function-view opaqueness already looks a lot 
like our definitions of anonymity. Given $f$ (or, more precisely, the set $\{f_{(r,m)}\}$ of 
functions) mapping actions to agents, we can state a theorem connecting anonymity 
to function-view opaqueness. There are two minor issues to deal with, though. 
First, our definitions of anonymity are stated with respect to a single action $a$, 
while the function $f$ deals with a set of actions. We can deal with this by taking the 
domain of $f$ to be the singleton $\{a\}$. Second, our definition of anonymity up to a 
set $I_A$ requires the observer to suspect agents in $I_A$ only if $i$ actually performs the 
action $a$. (Recall this is also true for Syverson and Stubblebine's definitions.) 
$I_A$-value opaqueness requires the observer to think many agents could have performed 
an action even if nobody has. To deal with this, we require opaqueness only when 
the action has been performed by one of the agents in $I_A$. 

Theorem 5.3: Suppose that $(\mathcal{I}, r, m) \models \theta(i, a)$ exactly if $f_{(r,m)}(a) = i$. Then 
action $a$ is anonymous up to $I_A$ with respect to $o$ for each agent $i \in I_A$ if and only 
if at all points $(r, m)$ such that $f_{(r,m)}(a) \in I_A$, $f$ is $I_A$-value opaque with respect 
to $o$. 

Proof: Suppose that $f$ is $I_A$-value opaque, and let $i \in I_A$ be given. If 
$(\mathcal{I}, r, m) \models \theta(i, a)$, then $f_{(r,m)}(a) = i$. We must show that, for all $i' \in I_A$, 
$(\mathcal{I}, r, m) \models P_o[\theta(i', a)]$. Because $f$ is $I_A$-value opaque at $(r, m)$, there exists a 
point $(r', m')$ such that $r'_o(m') = r_o(m)$ and $f_{(r',m')}(a) = i'$. Because 
$(\mathcal{I}, r', m') \models \theta(i', a)$, $(\mathcal{I}, r, m) \models P_o[\theta(i', a)]$. 

Conversely, suppose that for each agent $i \in I_A$, $a$ is anonymous up to $I_A$ 
with respect to $o$. Let $(r, m)$ be given such that $f_{(r,m)}(a) \in I_A$, and let 
$i = f_{(r,m)}(a)$. It follows that $(\mathcal{I}, r, m) \models \theta(i, a)$. For any $i' \in I_A$, 
$(\mathcal{I}, r, m) \models P_o[\theta(i', a)]$, by anonymity. Thus there exists a point $(r', m')$ such that 
$r'_o(m') = r_o(m)$ and $(\mathcal{I}, r', m') \models \theta(i', a)$. It follows that $f_{(r',m')}(a) = i'$, and that 
$f$ is $I_A$-value opaque. ∎ 

As with Proposition 5.2, it would be easy to state analogous theorems connecting 
our other definitions of anonymity, including minimal anonymity, total 
anonymity, and $k$-anonymity, to other forms of function-view opaqueness. We 
omit the details here. 

The assumptions needed to prove Theorem 5.3 illustrate two ways in which 
our approach may seem to be less general than the function-view approach. First, 
all our definitions are given with respect to a single action, rather than with respect 
to a set of actions. However, it is perfectly reasonable to specify that all actions in 
some set $A$ of actions be anonymous. Then we could modify Theorem 5.3 so that 
the function $f$ is defined on all actions in $A$. (We omit the details.) Second, our 
definitions of anonymity only restrict the observer's knowledge if somebody actu- 
ally performs the action. This is simply a different way of defining anonymity. As 
mentioned previously, we are not trying to give a definitive definition of anonymity, 
and it certainly seems reasonable that someone might want to define or specify 
anonymity using the stronger condition. At any rate, it would be straightforward to 
modify our definitions so that the implications, involving $\theta(i, a)$, are not included. 

Hughes and Shmatikov argue that epistemic logic is a useful language for ex- 
pressing anonymity specifications, while CSP is a useful language for describing 
and specifying systems. We agree with both of these claims. They propose func- 
tion views as a useful interface to mediate between the two. We have tried to argue 
here that no mediation is necessary, since the multiagent systems framework can 
also be used for describing systems. (Indeed, the traces of CSP can essentially be 
viewed as runs.) Nevertheless, we do believe that function views can be the ba- 
sis of a useful language for reasoning about some aspects of information hiding. 
We can well imagine adding abbreviations to the language that let us talk directly 
about function views. (We remark that we view these abbreviations as syntactic 
sugar, since these are notions that can already be expressed directly in terms of the 
knowledge operators we have introduced.) 

On the other hand, we believe that function views are not expressive enough 
to capture all aspects of information hiding. One obvious problem is adding prob- 
ability. While it is easy to add probability to systems, as we have shown, and to 
capture interesting probabilistic notions of anonymity, it is far from clear how to 
do this if we take function-view triples as primitive. 

To sum up, we would argue that to reason about knowledge and probability, 
we need to have possible worlds as the underlying semantic framework. Using 
the multiagent systems approach gives us possible worlds in a way that makes 
it particularly easy to relate them to systems. Within this semantic framework, 
function views may provide a useful syntactic construct with which to reason about 
information hiding. 
6 Discussion 



We have described a framework for reasoning about information hiding in multia- 
gent systems, and have given general definitions of anonymity for agents acting in 
such systems. We have also compared and contrasted our definitions to other sim- 
ilar definitions of anonymity. Our knowledge-based system framework provides a 
number of advantages: 

• We are able to state information-hiding properties succinctly and intuitively, 
and in terms of the knowledge of the observers or attackers who interact with 
the system. 

• Our system has a well-defined semantics that lets us reason about knowledge 
in systems of interest, such as systems specified using process algebras or 
strand spaces. 

• We are able to give straightforward probabilistic definitions of anonymity, 
and of other related information-hiding properties. 

There are a number of issues that this paper has not addressed. We have focused 
almost exclusively on properties of anonymity, and have not considered related 
notions, such as pseudonymity and unlinkability [Hughes and Shmatikov 2004; 
Pfitzmann and Köhntopp 2001]. 
There seems to be no intrinsic difficulty capturing these notions in our framework. 
For example, one form of message unlinkability specifies that no two messages 
sent by an anonymous sender can be "linked", in the sense that an observer can 
determine that both messages were sent by the same sender. More formally, two 
actions a and a' are linked with respect to an observer o if o knows that there exists 
an agent i who performed both a and a' . This definition can be directly captured 
using knowledge. Its negation says that o considers it possible that there exist two 
distinct agents who performed a and a'; this can be viewed as a definition of min- 
imal unlinkability. This minimal requirement can be strengthened, exactly as our 
definitions of anonymity were, to include larger numbers of distinct agents, proba- 
bility, and so on. Although we have not worked out the details, we believe that our 
approach will be similarly applicable to other definitions of information hiding. 
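
As a hedged illustration, the informal definition of linking given above can be 
written with the knowledge operator $K_o$ and its dual $P_o$. The formulas below 
are a sketch and are not part of the definitions given earlier in the paper; they 
assume that $\theta(i, a)$ says that agent $i$ performs $a$, that $i$ ranges over a finite 
set of agents, and the name $\mathit{Linked}_o$ is hypothetical. 

```latex
\[
\mathit{Linked}_o(a, a') \;:=\;
  K_o\Big[\bigvee_{i} \big(\theta(i, a) \land \theta(i, a')\big)\Big]
\]
\[
\neg\,\mathit{Linked}_o(a, a') \;\equiv\;
  P_o\Big[\neg \bigvee_{i} \big(\theta(i, a) \land \theta(i, a')\big)\Big]
\]
```

If each of $a$ and $a'$ is performed by some agent at every point that $o$ considers 
possible, the second formula says that $o$ considers it possible that two distinct 
agents performed $a$ and $a'$, which is the minimal form of unlinkability. 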

Another obviously important issue is checking whether a given system satisfies 
the knowledge-based properties we have introduced. The standard technique 
for doing this is model checking. Recent work on the problem of model checking 
in the multiagent systems framework suggests that this may be viable. Van der 
Meyden [1998] discusses algorithms and complexity results for model checking a 
wide range of epistemic formulas in the runs and systems framework, and van der 
Meyden and Su [2004] use these results to verify the dining cryptographers protocol 
[Chaum 1988], using formulas much like those described in Section 3.3. Even 
though model checking of formulas involving knowledge seems to be intractable 
for large problems, these results are a promising first step towards being able to 
use knowledge for both the specification and verification of anonymity properties. 
Shmatikov [2002], for example, analyzes the Crowds system using the probabilistic 
model checker PRISM [Kwiatkowska, Norman, and Parker 2001]. This is a 
particularly good example of how definitions of anonymity can be made precise using 
logic and probability, and how model-checking can generate new insights into the 
functioning of a deployed protocol. 

Finally, it is important to note that the examples considered in this paper do 
not reflect the state of the art for computational anonymity. Anonymity proto- 
cols based on DC-nets, while theoretically interesting, have not been widely de- 
ployed; in practice, protocols based on mixes and message-rerouting are much 
more common. We used the dining cryptographers problem as a running example 
here mainly because of its simplicity, but it remains to be seen whether our general 
approach will be as illuminating for more complicated protocols. There are reasons 
to believe that it will be. Shmatikov's analysis of Crowds shows that a logic-based 
approach can be useful for analyzing protocols based on message-rerouting. Fur- 
thermore, we believe that formalizing anonymity protocols using techniques like 
ours is worthwhile even if formal verification is impractical or impossible. It forces 
system designers to think carefully about information-hiding requirements, which 
can often be tricky, and provides a system-independent framework for comparing 
the anonymity guarantees provided by different systems. 

We described one way to generate a set of runs from a CSP process P, ba- 
sically by recording all the events in the state of the environment and describing 
some observer o who is able to observe a subset of the events. This translation was 
useful for comparing our abstract definitions of anonymity to more operational 
CSP-based definitions. In future work we hope to further explore the connections 
between the runs and systems framework and tools such as CCS, CSP, and the spi 
calculus [Abadi and Gordon 1997]. In particular, we are interested in the recent 
work of Fournet and Abadi [2003], who use the applied pi calculus to model 
private authentication, according to which a principal in a network is able to 
authenticate herself to another principal while remaining anonymous to other "nearby" 
principals. A great deal of work in computer security has formalized information- 
hiding properties using these tools. Such work often reasons about the knowledge 
of various agents in an informal way, and then tries to capture knowledge-based 
security properties using one of these formalisms. By describing canonical trans- 
lations from these formalisms to the runs and systems framework, we hope to be 
able to demonstrate formally how such definitions of security do (or do not) capture 
notions of knowledge. 